Description

The present invention relates to the field of molecular biology. In particular, it relates to, among other things,
nucleotide sequences of Staphylococcus aureus, contigs, ORFs, fragments, probes, primers and related polynucleotides
thereof, peptides and polypeptides encoded by the sequences, and uses of the polynucleotides and sequences thereof,
such as in fermentation, polypeptide production, assays and pharmaceutical development, among others.

The genus Staphylococcus includes at least 20 distinct species. (For a review see Novick, R. P., The Staphylococcus
as a Molecular Genetic System, Chapter 1, pgs. 1-37 in MOLECULAR BIOLOGY OF THE STAPHYLOCOCCI,
R. Novick, Ed., VCH Publishers, New York (1990)). Species differ from one another by 80% or more, by hybridization
kinetics, whereas strains within a species are at least 90% identical by the same measure.

The species Staphylococcus aureus, a gram-positive, facultatively aerobic, clump-forming coccus, is among the
most important etiological agents of bacterial infection in humans, as discussed briefly below.

Human Health and S. aureus

Staphylococcus aureus is a ubiquitous pathogen. (See, for instance, Mims et al., MEDICAL MICROBIOLOGY,
Mosby-Year Book Europe Limited, London, UK (1993)). It is an etiological agent of a variety of conditions, ranging in
severity from mild to fatal. A few of the more common conditions caused by S. aureus infection are burns, cellulitis,
eyelid infections, food poisoning, joint infections, neonatal conjunctivitis, osteomyelitis, skin infections, surgical wound
infection, scalded skin syndrome and toxic shock syndrome, some of which are described further below.

Burns 

Burn wounds generally are sterile initially. However, they generally compromise physical and immune barriers to
infection, cause loss of fluid and electrolytes and result in local or general physiological dysfunction. After cooling,
contact with viable bacteria results in mixed colonization at the injury site. Infection may be restricted to the non-viable
debris on the burn surface ("eschar"); it may progress into full skin infection and invade viable tissue below the eschar;
and it may reach below the skin, enter the lymphatic and blood circulation and develop into septicaemia. S. aureus is
among the most important pathogens typically found in burn wound infections. It can destroy granulation tissue and
produce severe septicaemia.

Cellulitis 

Cellulitis, an acute infection of the skin that expands from a typically superficial origin to spread below the cutaneous
layer, most commonly is caused by S. aureus in conjunction with S. pyogenes. Cellulitis can lead to systemic infection.
In fact, cellulitis can be one aspect of synergistic bacterial gangrene. This condition typically is caused by a mixture of
S. aureus and microaerophilic streptococci. It causes necrosis and treatment is limited to excision of the necrotic tissue.
The condition often is fatal.

Eyelid infections

S. aureus is the cause of styes and of "sticky eye" in neonates, among other eye infections. Typically such infections
are limited to the surface of the eye, but may occasionally penetrate the surface with more severe consequences.

Food poisoning

Some strains of S. aureus produce one or more of five serologically distinct, heat and acid stable enterotoxins
(enterotoxins A-E) that are not destroyed by the digestive processes of the stomach and small intestine. Ingestion of the
toxin, in sufficient quantities, typically results in severe vomiting, but not diarrhoea. The effect does not require viable
bacteria. Although the toxins are known, their mechanism of action is not understood.

Joint infections 

S. aureus infects bone joints, causing diseases such as osteomyelitis.

Osteomyelitis 

S. aureus is the most common causative agent of haematogenous osteomyelitis. The disease tends to occur in
children and adolescents more than adults and it is associated with non-penetrating injuries to bones. Infection typically
occurs in the long end of growing bone, hence its occurrence in physically immature populations. Most often, infection
is localized in the vicinity of sprouting capillary loops adjacent to epiphysial growth plates in the end of long, growing
bones.

Skin infections 

S. aureus is the most common pathogen of such minor skin infections as abscesses and boils. Such infections
often are resolved by normal host response mechanisms, but they also can develop into severe internal infections.
Recurrent infections of the nasal passages plague nasal carriers of S. aureus.

Surgical Wound Infections 

Surgical wounds often penetrate far into the body. Infection of such wounds thus poses a grave risk to the patient.
S. aureus is the most important causative agent of infections in surgical wounds. S. aureus is unusually adept at
invading surgical wounds; sutured wounds can be infected by far fewer S. aureus cells than are necessary to cause
infection in normal skin. Invasion of surgical wounds can lead to severe S. aureus septicaemia. Invasion of the blood
stream by S. aureus can lead to seeding and infection of internal organs, particularly heart valves and bone, causing
systemic diseases, such as endocarditis and osteomyelitis.

Scalded Skin Syndrome 

S. aureus is responsible for "scalded skin syndrome" (also called toxic epidermal necrosis, Ritter's disease and
Lyell's disease). This disease occurs in older children, typically in outbreaks caused by flowering of S. aureus strains
that produce exfoliatin (also called scalded skin syndrome toxin). Although the bacteria initially may infect only a minor
lesion, the toxin destroys intercellular connections, spreads epidermal layers and allows the infection to penetrate the
outer layer of the skin, producing the desquamation that typifies the disease. Shedding of the outer layer of skin
generally reveals normal skin below, but fluid lost in the process can produce severe injury in young children if it is not
treated properly.

Toxic Shock Syndrome 

Toxic shock syndrome is caused by strains of S. aureus that produce the so-called toxic shock syndrome toxin.
The disease can be caused by S. aureus infection at any site, but it is too often erroneously viewed as a
disease solely of women who use tampons. The disease involves toxaemia and septicaemia, and can be fatal.

Nosocomial Infections

In the 1984 National Nosocomial Infection Surveillance Study ("NNIS") S. aureus was the most prevalent agent
of surgical wound infections in many hospital services, including medicine, surgery, obstetrics, pediatrics and newborns.

Resistance to drugs of S. aureus strains

Prior to the introduction of penicillin the prognosis for patients seriously infected with S. aureus was unfavorable.
Following the introduction of penicillin in the early 1940s even the worst S. aureus infections generally could be treated
successfully. The emergence of penicillin-resistant strains of S. aureus did not take long, however. Most strains of S.
aureus encountered in hospital infections today do not respond to penicillin, although, fortunately, this is not the case
for S. aureus encountered in community infections.

It is well known now that penicillin-resistant strains of S. aureus produce a lactamase which converts penicillin to
penicilloic acid, and thereby destroys antibiotic activity. Furthermore, the lactamase gene often is propagated episomally,
typically on a plasmid, and often is only one of several genes on an episomal element that, together, confer
multidrug resistance.

Methicillins, introduced in the 1960s, largely overcame the problem of penicillin resistance in S. aureus. These
compounds conserve the portions of penicillin responsible for antibiotic activity and modify or alter other portions that
make penicillin a good substrate for inactivating lactamases. However, methicillin resistance has emerged in S. aureus,
along with resistance to many other antibiotics effective against this organism, including aminoglycosides, tetracycline,
chloramphenicol, macrolides and lincosamides. In fact, methicillin-resistant strains of S. aureus generally are multiply
drug resistant.
The molecular genetics of most types of drug resistance in S. aureus has been elucidated. (See Lyon et al., Microbiology
Reviews 51: 88-134 (1987)). Generally, resistance is mediated by plasmids, as noted above regarding penicillin
resistance; however, several stable forms of drug resistance have been observed that apparently involve integration
of a resistance element into the S. aureus genome itself.

Thus far each new antibiotic gives rise to resistant strains, strains emerge that are resistant to multiple drugs
and increasingly persistent forms of resistance begin to emerge. Drug resistance of S. aureus infections already poses
significant treatment difficulties, which are likely to get much worse unless new therapeutic agents are developed.

Molecular Genetics of Staphylococcus aureus

Despite its importance in, among other things, human disease, relatively little is known about the genome of this
organism.

Most genetic studies of S. aureus have been carried out using the strain NCTC8325, which contains prophages
φ11, φ12 and φ13, and the UV-cured derivative of this strain, 8325-4 (also referred to as RN450), which is free of
the prophages.

These studies revealed that the S. aureus genome, like that of other staphylococci, consists of one circular,
covalently closed, double-stranded DNA and a collection of so-called variable accessory genetic elements, such as
prophages, plasmids, transposons and the like.

Physical characterization of the genome has not been carried out in any detail. Pattee et al. published a low resolution
and incomplete genetic and physical map of the chromosome of S. aureus strain NCTC 8325. (Pattee et al.,
Genetic and Physical Mapping of Chromosome of Staphylococcus aureus NCTC 8325, Chapter 11, pgs. 163-169 in
MOLECULAR BIOLOGY OF THE STAPHYLOCOCCI, R.P. Novick, Ed., VCH Publishers, New York (1990)). The genetic
map largely was produced by mapping insertions of Tn551 and Tn4001, which, respectively, confer erythromycin and
gentamicin resistance, and by analysis of SmaI-digested DNA by Pulsed Field Gel Electrophoresis ("PFGE").

The map was of low resolution; even estimating the physical size of the genome was difficult, according to the
investigators. The size of the largest SmaI chromosome fragment, for instance, was too large for accurate sizing by
PFGE. To estimate its size, additional restriction sites had to be introduced into the chromosome using a transposon
containing a SmaI recognition sequence.

In sum, most physical characteristics and almost all of the genes of Staphylococcus aureus are unknown. Among
the few genes that have been identified, most have not been physically mapped or characterized in detail. Only a very
few genes of this organism have been sequenced. (See, for instance, Thornsberry, J. Antimicrobial Chemotherapy 21
Suppl. C: 9-16 (1988), current versions of GENBANK and other nucleic acid databases, and references that relate to
the genome of S. aureus such as those set out elsewhere herein.)

It is clear that the etiology of diseases mediated or exacerbated by S. aureus infection involves the programmed
expression of S. aureus genes, and that characterizing the genes and their patterns of expression would add dramatically
to our understanding of the organism and its host interactions. Knowledge of S. aureus genes and genomic
organization would dramatically improve understanding of disease etiology and lead to improved and new ways of
preventing, ameliorating, arresting and reversing diseases. Moreover, characterized genes and genomic fragments of
S. aureus would provide reagents for, among other things, detecting, characterizing and controlling S. aureus infections.

There is a need, therefore, to characterize the genome of S. aureus and for polynucleotides and sequences of this
organism.

The present invention is based on the sequencing of fragments of the Staphylococcus aureus genome. The primary 
nucleotide sequences which were generated are provided in SEQ ID NOS: 1-5,191. 

The present invention provides the nucleotide sequence of several thousand contigs of the Staphylococcus aureus
genome, which are listed in tables below and set out in the Sequence Listing submitted herewith, and representative
fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan. In one embodiment,
the present invention is provided as contiguous strings of primary sequence information corresponding to the
nucleotide sequences depicted in SEQ ID NOS:1-5,191.

The present invention further provides nucleotide sequences which are at least 95%, preferably 99% and most
preferably 99.9%, identical to the nucleotide sequences of SEQ ID NOS:1-5,191.

The nucleotide sequence of SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide sequence
which is at least 95%, preferably 99% and most preferably 99.9%, identical to the nucleotide sequence of SEQ ID
NOS:1-5,191 may be provided in a variety of mediums to facilitate its use. In one application of this embodiment, the
sequences of the present invention are recorded on computer readable media. Such media include, but are not limited
to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media
such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/
optical storage media.

The present invention further provides systems, particularly computer-based systems, which contain the sequence
information herein described stored in a data storage means. Such systems are designed to identify commercially
important fragments of the Staphylococcus aureus genome.

Another embodiment of the present invention is directed to fragments, preferably isolated fragments, of the
Staphylococcus aureus genome having particular structural or functional attributes. Such fragments of the Staphylococcus
aureus genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter
referred to as open reading frames or "ORFs," fragments which modulate the expression of an operably linked ORF,
hereinafter referred to as expression modulating fragments or "EMFs," and fragments which can be used to diagnose
the presence of Staphylococcus aureus in a sample, hereinafter referred to as diagnostic fragments or "DFs."

Each of the ORFs in fragments of the Staphylococcus aureus genome disclosed in Tables 1-3, and the EMFs
found 5' to the ORFs, can be used in numerous ways as polynucleotide reagents. For instance, the sequences can be
used as diagnostic probes or amplification primers for detecting or determining the presence of a specific microbe in
a sample, to selectively control gene expression in a host and in the production of polypeptides, such as polypeptides
encoded by ORFs of the present invention, particularly those polypeptides that have a pharmacological activity.

The present invention further includes recombinant constructs comprising one or more fragments of the
Staphylococcus aureus genome of the present invention. The recombinant constructs of the present invention comprise
vectors, such as a plasmid or viral vector, into which a fragment of the Staphylococcus aureus genome has been inserted.

The present invention further provides host cells containing any of the isolated fragments of the Staphylococcus
aureus genome of the present invention. The host cells can be a higher eukaryotic host cell, such as a mammalian
cell, a lower eukaryotic cell, such as a yeast cell, or a procaryotic cell such as a bacterial cell.

The present invention is further directed to polypeptides and proteins, preferably isolated polypeptides and proteins,
encoded by ORFs of the present invention. A variety of methods, well known to those of skill in the art, routinely
may be utilized to obtain any of the polypeptides and proteins of the present invention. For instance, polypeptides and
proteins of the present invention having relatively short, simple amino acid sequences readily can be synthesized using
commercially available automated peptide synthesizers. Polypeptides and proteins of the present invention also may
be purified from bacterial cells which naturally produce the protein. Yet another alternative is to purify polypeptides and
proteins of the present invention from cells which have been altered to express them.

The invention further provides polypeptides, preferably isolated polypeptides, comprising Staphylococcus aureus
epitopes and vaccine compositions comprising such polypeptides. Also provided are methods for vaccinating an
individual against Staphylococcus aureus infection.

The invention further provides methods of obtaining homologs of the fragments of the Staphylococcus aureus
genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention.
Specifically, by using the nucleotide and amino acid sequences disclosed herein as a probe or as primers, and techniques
such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.

The invention further provides antibodies which selectively bind polypeptides and proteins of the present invention.
Such antibodies include both monoclonal and polyclonal antibodies.

The invention further provides hybridomas which produce the above-described antibodies. A hybridoma is an
immortalized cell line which is capable of secreting a specific monoclonal antibody.

The present invention further provides methods of identifying test samples derived from cells which express one
of the ORFs of the present invention, or a homolog thereof. Such methods comprise incubating a test sample with one
or more of the antibodies of the present invention, or one or more of the DFs or antigens of the present invention, under
conditions which allow a skilled artisan to determine if the sample contains the ORF or product produced therefrom.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry 
out the above-described assays. 

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers
which comprises: (a) a first container comprising one of the antibodies, antigens, or one of the DFs of the present
invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents
capable of detecting presence of bound antibodies, antigens or hybridized DFs.

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining
and identifying agents capable of binding to a polypeptide or protein encoded by one of the ORFs of the present
invention. Specifically, such agents include, as further described below, antibodies, peptides, carbohydrates,
pharmaceutical agents and the like. Such methods comprise steps of: (a) contacting an agent with an isolated protein encoded
by one of the ORFs of the present invention; and (b) determining whether the agent binds to said protein.

The present genomic sequences of Staphylococcus aureus will be of great value to all laboratories working with
this organism and for a variety of commercial purposes. Many fragments of the Staphylococcus aureus genome will
be immediately identified by similarity searches against GenBank or protein databases and will be of immediate value
to Staphylococcus aureus researchers and of immediate commercial value for the production of proteins or to control
gene expression.

The methodology and technology for elucidating extensive genomic sequences of bacterial and other genomes
has enhanced and will greatly enhance the ability to analyze and understand chromosomal organization. In particular, sequenced
contigs and genomes will provide the models for developing tools for the analysis of chromosome structure and function,
including the ability to identify genes within large segments of genomic DNA, the structure, position, and spacing of
regulatory elements, the identification of genes with potential industrial applications, and the ability to do comparative
genomics and molecular phylogeny.

FIGURE 1 is a block diagram of a computer system (102) that can be used to implement computer-based systems
of the present invention.

FIGURE 2 is a schematic diagram depicting the data flow and computer programs used to collect, assemble, edit
and annotate the contigs of the Staphylococcus aureus genome of the present invention. Both Macintosh and Unix
platforms are used to handle the AB 373 and 377 sequence data files, largely as described in Kerlavage et al.,
Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System Sciences, 585, IEEE Computer
Society Press, Washington D.C. (1993). Factura (AB) is a Macintosh program designed for automatic vector sequence
removal and end-trimming of sequence files. The program Loadis runs on a Macintosh platform and parses the feature
data extracted from the sequence files by Factura to the Unix based Staphylococcus aureus relational database.
Assembly of contigs (and whole genome sequences) is accomplished by retrieving a specific set of sequence files and
their associated features using extrseq, a Unix utility for retrieving sequences from an SQL database. The resulting
sequence file is processed by seq_filter to trim portions of the sequences with more than 2% ambiguous nucleotides.
The sequence files were assembled using TIGR Assembler, an assembly engine designed at The Institute for Genomic
Research ("TIGR") for rapid and accurate assembly of thousands of sequence fragments. The collection of contigs
generated by the assembly step is loaded into the database with the lassie program. Identification of open reading
frames (ORFs) is accomplished by processing contigs with zorf. The ORFs are searched against S. aureus sequences
from GenBank and against all protein sequences using the BLASTN and BLASTP programs, described in Altschul et
al., J. Mol. Biol. 215: 403-410 (1990). Results of the ORF determination and similarity searching steps were loaded
into the database. As described below, some results of the determination and the searches are set out in Tables 1-3.
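
By way of illustration only, the trimming of sequence portions having more than 2% ambiguous nucleotides might be sketched as follows. The internals of seq_filter are not described herein; the fixed window size, the function names and the decision to trim whole windows from each end of a read are assumptions made solely for this example.

```python
# Illustrative sketch only; not the seq_filter program referred to above.
# Assumption: ambiguity is measured in fixed-size windows at each end of a read.

AMBIGUOUS = set("NRYKMSWBDHV")  # IUPAC nucleotide codes other than A, C, G, T


def ambiguous_fraction(seq: str) -> float:
    """Return the fraction of bases in seq that are ambiguous IUPAC codes."""
    if not seq:
        return 0.0
    return sum(base in AMBIGUOUS for base in seq.upper()) / len(seq)


def trim_ambiguous(seq: str, window: int = 50, max_frac: float = 0.02) -> str:
    """Drop windows from both ends while their ambiguity exceeds max_frac."""
    start, end = 0, len(seq)
    # Trim from the 5' end, one window at a time.
    while end - start >= window and ambiguous_fraction(seq[start:start + window]) > max_frac:
        start += window
    # Trim from the 3' end, one window at a time.
    while end - start >= window and ambiguous_fraction(seq[end - window:end]) > max_frac:
        end -= window
    return seq[start:end]


if __name__ == "__main__":
    noisy_head = "ACGTN" * 10        # 50 bases, 20% ambiguous
    clean_core = "ACGT" * 25         # 100 bases, no ambiguous codes
    noisy_tail = "NNACGTACGT" * 5    # 50 bases, 20% ambiguous
    print(len(trim_ambiguous(noisy_head + clean_core + noisy_tail)))
```

A window-based rule is only one reading of "portions of the sequences"; a sliding window or a per-read threshold would be equally consistent with the description.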

The present invention is based on the sequencing of fragments of the Staphylococcus aureus genome and analysis
of the sequences. The primary nucleotide sequences generated by sequencing the fragments are provided in SEQ ID
NOS:1-5,191. (As used herein, the "primary sequence" refers to the nucleotide sequence represented by the IUPAC
nomenclature system.)

In addition to the aforementioned Staphylococcus aureus polynucleotide and polynucleotide sequences, the
present invention provides the nucleotide sequences of SEQ ID NOS:1-5,191, or representative fragments thereof, in
a form which can be readily used, analyzed, and interpreted by a skilled artisan.

As used herein, a "representative fragment of the nucleotide sequence depicted in SEQ ID NOS:1-5,191" refers
to any portion of the SEQ ID NOS:1-5,191 which is not presently represented within a publicly available database.
Preferred representative fragments of the present invention are Staphylococcus aureus open reading frames ("ORFs"),
expression modulating fragments ("EMFs") and fragments which can be used to diagnose the presence of Staphylococcus
aureus in a sample ("DFs"). A non-limiting identification of preferred representative fragments is provided in Tables
1-3.
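
As a non-limiting illustration of what is meant by an open reading frame, the following sketch scans both strands of a nucleotide sequence for stretches that begin with a start codon and run in frame to a stop codon. It is not the zorf program; the use of ATG as the only start codon, the minimum length and the function names are assumptions made for the example, and alternative start codons used by bacteria are ignored for simplicity.

```python
# Illustrative ORF scan; an assumption-laden sketch, not the zorf program.

STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 30):
    """Return (strand, start, end, orf) tuples for ATG...stop stretches.

    Coordinates are 0-based, end-exclusive, and relative to the strand that
    was scanned. min_codons counts codons from the ATG up to, but not
    including, the stop codon.
    """
    results = []
    for strand, dna in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(dna) - 2, 3):
                codon = dna[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOPS:
                    if (pos - start) // 3 >= min_codons:
                        results.append((strand, start, pos + 3, dna[start:pos + 3]))
                    start = None  # look for the next ORF in this frame
    return results


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
    for orf in find_orfs(toy, min_codons=3):
        print(orf)
```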

As discussed in detail below, the information provided in SEQ ID NOS:1-5,191 and in Tables 1-3 together with
routine cloning, synthesis, sequencing and assay methods will enable those skilled in the art to clone and sequence
all "representative fragments" of interest, including open reading frames encoding a large variety of Staphylococcus
aureus proteins.

While the presently disclosed sequences of SEQ ID NOS:1-5,191 are highly accurate, sequencing techniques are
not perfect and, in relatively rare instances, further investigation of a fragment or sequence of the invention may reveal
a nucleotide sequence error present in a nucleotide sequence disclosed in SEQ ID NOS:1-5,191. However, once the
present invention is made available (i.e., once the information in SEQ ID NOS:1-5,191 and Tables 1-3 has been made
available), resolving a rare sequencing error in SEQ ID NOS:1-5,191 will be well within the skill of the art. The present
disclosure makes available sufficient sequence information to allow any of the described contigs or portions thereof to
be obtained readily by straightforward application of routine techniques. Further sequencing of such polynucleotides
may proceed in like manner using manual and automated sequencing methods which are employed ubiquitously in the
art. Nucleotide sequence editing software is publicly available. For example, Applied Biosystems' (AB) AutoAssembler
can be used as an aid during visual inspection of nucleotide sequences. By employing such routine techniques potential
errors readily may be identified and the correct sequence then may be ascertained by targeting further sequencing
effort, also of a routine nature, to the region containing the potential error.

Even if all of the very rare sequencing errors in SEQ ID NOS:1-5,191 were corrected, the resulting nucleotide
sequences would still be at least 95% identical, nearly all would be at least 99% identical, and the great majority would
be at least 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-5,191.

As discussed elsewhere herein, polynucleotides of the present invention readily may be obtained by routine
application of well known and standard procedures for cloning and sequencing DNA. Detailed methods for obtaining
libraries and for sequencing are provided below, for instance. A wide variety of Staphylococcus aureus strains that can
be used to prepare S. aureus genomic DNA for cloning and for obtaining polynucleotides of the present invention are
available to the public from recognized depository institutions, such as the American Type Culture Collection ("ATCC").

The nucleotide sequences of the genomes from different strains of Staphylococcus aureus differ somewhat.
However, the nucleotide sequences of the genomes of all Staphylococcus aureus strains will be at least 95% identical, in
corresponding part, to the nucleotide sequences provided in SEQ ID NOS:1-5,191. Nearly all will be at least 99%
identical and the great majority will be 99.9% identical.

Thus, the present invention further provides nucleotide sequences which are at least 95%, preferably 99% and
most preferably 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-5,191, in a form which can be readily
used, analyzed and interpreted by the skilled artisan.

Methods for determining whether a nucleotide sequence is at least 95%, at least 99% or at least 99.9% identical
to the nucleotide sequences of SEQ ID NOS:1-5,191 are routine and readily available to the skilled artisan. For example,
the well known fasta algorithm described in Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85: 2444 (1988) can be
used to generate the percent identity of nucleotide sequences. The BLASTN program also can be used to generate
an identity score of polynucleotides compared to one another.
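
The percent identity thresholds recited above can be illustrated with a short calculation. The sketch below is neither the fasta algorithm nor BLASTN: it assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and simply counts matching columns. The function names and the choice to count a gapped column as a mismatch are assumptions for this example.

```python
# Illustrative percent-identity calculation on a pre-computed alignment.
# Assumption: inputs are already aligned; this does not perform alignment.

THRESHOLDS = (99.9, 99.0, 95.0)  # the identity levels recited in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


def highest_threshold_met(identity: float):
    """Return the highest recited threshold that identity satisfies, or None."""
    for threshold in THRESHOLDS:
        if identity >= threshold:
            return threshold
    return None


if __name__ == "__main__":
    reference = "ACGT" * 25               # 100 bases
    variant = "T" + reference[1:]         # one substitution
    identity = percent_identity(reference, variant)
    print(identity, highest_threshold_met(identity))
```

In practice the alignment itself determines the result, so an identity figure is meaningful only together with the alignment method and parameters that produced it.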

COMPUTER RELATED EMBODIMENTS 

The nucleotide sequences provided in SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide
sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to a polynucleotide sequence
of SEQ ID NOS:1-5,191 may be "provided" in a variety of mediums to facilitate use thereof. As used herein,
"provided" refers to a manufacture, other than an isolated nucleic acid molecule, which contains a nucleotide sequence
of the present invention; i.e., a nucleotide sequence provided in SEQ ID NOS:1-5,191, a representative fragment
thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical
to a polynucleotide of SEQ ID NOS:1-5,191. Such a manufacture provides a large portion of the Staphylococcus aureus
genome and parts thereof (e.g., a Staphylococcus aureus open reading frame (ORF)) in a form which allows a skilled
artisan to examine the manufacture using means not directly applicable to examining the Staphylococcus aureus genome
or a subset thereof as it exists in nature or in purified form.

In one application of this embodiment, a nucleotide sequence of the present invention can be recorded on computer
readable media. As used herein, "computer readable media" refers to any medium which can be read and accessed
directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard
disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as
RAM and ROM; and hybrids of these categories, such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable mediums can be used to create a manufacture comprising
computer readable medium having recorded thereon a nucleotide sequence of the present invention. Likewise,
it will be clear to those of skill how additional computer readable media that may be developed also can be used to
create analogous manufactures having recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled
artisan can readily adopt any of the presently known methods for recording information on computer readable medium
to generate manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium
having recorded thereon a nucleotide sequence of the present invention. The choice of the data storage structure will
generally be based on the means chosen to access the stored information. In addition, a variety of data processor
programs and formats can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file, formatted in
commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored
in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of
data-processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having
recorded thereon the nucleotide sequence information of the present invention.
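
The two storage styles mentioned above, a plain ASCII file and a database application, might be sketched as follows. This is an illustration under stated assumptions: SQLite (through Python's standard sqlite3 module) stands in for the database products named above, the FASTA-style text layout is one common ASCII convention rather than a format prescribed herein, and the record name and sequence are hypothetical placeholders, not entries from the Sequence Listing.

```python
# Illustrative sketch: recording sequence information as ASCII text and in a
# relational table. SQLite is an assumption standing in for DB2, Sybase or Oracle.
import sqlite3


def to_fasta(records: dict, width: int = 60) -> str:
    """Render {name: sequence} as FASTA-style ASCII text, wrapped at width."""
    lines = []
    for name, seq in records.items():
        lines.append(">" + name)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


def store_in_database(conn: sqlite3.Connection, records: dict) -> None:
    """Create a simple table if needed and insert or update each sequence."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(name TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sequences (name, residues) VALUES (?, ?)",
        list(records.items()),
    )
    conn.commit()


def fetch_sequence(conn: sqlite3.Connection, name: str):
    """Return the stored sequence for name, or None if it is absent."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    records = {"example_contig": "ACGT" * 20}  # hypothetical placeholder data
    print(to_fasta(records), end="")
    connection = sqlite3.connect(":memory:")
    store_in_database(connection, records)
    print(fetch_sequence(connection, "example_contig") == records["example_contig"])
```

Which form is preferable depends, as the text notes, on how the stored information is to be accessed: a flat file suits bulk transfer to search programs, while a table suits retrieval of individual records and their associated features.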

Computer software is publicly available which allows a skilled artisan to access sequence information provided in
a computer readable medium. Thus, by providing in computer readable form the nucleotide sequences of SEQ ID
NOS:1-5,191, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and
most preferably at least 99.9% identical to a sequence of SEQ ID NOS:1-5,191, the present invention enables the
skilled artisan routinely to access the provided sequence information for a wide variety of purposes.

The examples which follow demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase
system was used to identify open reading frames (ORFs) within the Staphylococcus aureus genome which contain
homology to ORFs or proteins from both Staphylococcus aureus and from other organisms. Among the ORFs discussed
herein are protein encoding fragments of the Staphylococcus aureus genome useful in producing commercially important
proteins, such as enzymes used in fermentation reactions and in the production of commercially useful metabolites.

The present invention further provides systems, particularly computer-based systems, which contain the sequence
information described herein. Such systems are designed to identify, among other things, commercially important
fragments of the Staphylococcus aureus genome.

As used herein, "a computer-based system" refers to the hardware means, software means, and data storage
means used to analyze the nucleotide sequence information of the present invention. The minimum hardware means
of the computer-based systems of the present invention comprises a central processing unit (CPU), input means,
output means, and data storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention.

As stated above, the computer-based systems of the present invention comprise a data storage means having 
stored therein a nucleotide sequence of the present invention and the necessary hardware means and software means 
for supporting and implementing a search means. 

As used herein, "data storage means" refers to memory which can store nucleotide sequence information of the
present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide
sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented on the computer-based
system to compare a target sequence or target structural motif with the sequence information stored within the data
storage means. Search means are used to identify fragments or regions of the present genomic sequences which
match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly and a variety
of commercially available software for conducting search means are and can be used in the computer-based systems
of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and
BLASTX (NCBIA). A skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present computer-based systems.

As used herein, a "target sequence" can be any DNA or amino acid sequence of six or more nucleotides or two
or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target
sequence will be present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized
that searches for commercially important fragments, such as sequence fragments involved in gene expression and
protein processing, may be of shorter length.
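The relationship between target length and chance occurrence can be made concrete with a simple estimate. The sketch below assumes, purely for illustration, that residues are independent and equally frequent (four nucleotides or twenty amino acids) and that only one strand is searched; the function name and the database size of 2.8 million nucleotides used in the example are hypothetical.

```python
def expected_random_hits(target_length: int, database_length: int,
                         alphabet_size: int = 4) -> float:
    """Expected number of exact chance matches of one fixed target.

    Assumes independent, uniformly distributed residues, so each of the
    (database_length - target_length + 1) start positions matches with
    probability (1 / alphabet_size) ** target_length.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * (1.0 / alphabet_size) ** target_length


if __name__ == "__main__":
    database_length = 2_800_000  # hypothetical database size in nucleotides
    for n in (6, 12, 30):
        hits = expected_random_hits(n, database_length)
        print(f"{n:>2}-nucleotide target: about {hits:.3g} chance matches expected")
```

Under these assumptions a six-nucleotide target is expected to occur by chance hundreds of times in such a database, whereas a 30-nucleotide target is expected essentially never, which is consistent with the preference stated above for longer targets.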

As used herein, "a target structural motif," or "target motif," refers to any rationally selected sequence or combination
of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed
upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include,
but are not limited to, enzymic active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).

A variety of structural formats for the input and output means can be used to input and output the information in
the computer-based systems of the present invention. A preferred format for an output means ranks fragments of the
Staphylococcus aureus genomic sequences possessing varying degrees of homology to the target sequence or target
motif. Such presentation provides a skilled artisan with a ranking of sequences which contain various amounts of the
target sequence or target motif and identifies the degree of homology contained in the identified fragment.
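A ranked output of the kind described above might be sketched as follows. This is a deliberately naive, ungapped sliding-window comparison and is not the BLAST or BLAZE algorithm; the function names, the contig names and the toy sequences are assumptions made for illustration.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions that are identical (equal lengths assumed)."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def rank_fragments(target: str, database: dict, top: int = 5) -> list:
    """Score every target-length window of every stored sequence and
    return the best windows, highest percent identity first.

    Each hit is (percent_identity, sequence_name, 0-based start, window).
    """
    hits = []
    n = len(target)
    for name, seq in database.items():
        for start in range(len(seq) - n + 1):
            window = seq[start:start + n]
            hits.append((percent_identity(target, window), name, start, window))
    # Sort by descending identity, then by name and position for stable output.
    hits.sort(key=lambda h: (-h[0], h[1], h[2]))
    return hits[:top]


if __name__ == "__main__":
    toy_database = {  # placeholder sequences only
        "contig1": "AAACGTACGTTT",
        "contig2": "GGGACGAACGGG",
    }
    for identity, name, start, window in rank_fragments("ACGTACGT", toy_database, top=3):
        print(f"{identity:6.1f}%  {name}  start={start}  {window}")
```

Each reported line pairs a fragment with its degree of identity to the target, so that the listing both ranks the fragments and identifies the degree of homology in each.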

A variety of comparing means can be used to compare a target sequence or target motif with the data storage
means to identify sequence fragments of the Staphylococcus aureus genome. In the present examples, implementing
software which implements the BLAST and BLAZE algorithms, described in Altschul et al., J. Mol. Biol. 215:403-410
(1990), was used to identify open reading frames within the Staphylococcus aureus genome. A skilled artisan can
readily recognize that any one of the publicly available homology search programs can be used as the search means
for the computer-based systems of the present invention. Of course, suitable proprietary systems that may be known
to those of skill also may be employed in this regard.

Figure 1 provides a block diagram of a computer system illustrative of embodiments of this aspect of the present
invention. The computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104
are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage
devices 110, such as a hard drive 112 and a removable medium storage device 114. The removable medium storage
device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A removable
storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data
recorded therein may be inserted into the removable medium storage device 114. The computer system 102 includes
appropriate software for reading the control logic and/or the data from the removable medium storage device 114, once
it is inserted into the removable medium storage device 114.

A nucleotide sequence of the present invention may be stored in a well known manner in the main memory 108,
any of the secondary storage devices 110, and/or a removable storage medium 116. During execution, software for
accessing and processing the genomic sequence (such as search tools, comparing tools, etc.) resides in main memory
108, in accordance with the requirements and operating parameters of the operating system, the hardware system
and the software program or programs.

BIOCHEMICAL EMBODIMENTS

Other embodiments of the present invention are directed to fragments of the Staphylococcus aureus genome,
preferably to isolated fragments. The fragments of the Staphylococcus aureus genome of the present invention include,
but are not limited to, fragments which encode peptides, hereinafter open reading frames (ORFs), fragments which
modulate the expression of an operably linked ORF, hereinafter expression modulating fragments (EMFs), and fragments
which can be used to diagnose the presence of Staphylococcus aureus in a sample, hereinafter diagnostic
fragments (DFs).

As used herein, an "isolated nucleic acid molecule" or an "isolated fragment of the Staphylococcus aureus genome"
refers to a nucleic acid molecule possessing a specific nucleotide sequence which has been subjected to purification
means to reduce, from the composition, the number of compounds which are normally associated with the composition.
Particularly, the term refers to the nucleic acid molecules having the sequences set out in SEQ ID NOS:1-5,191, to
representative fragments thereof as described above, to polynucleotides at least 95%, preferably at least 99% and
especially preferably at least 99.9% identical in sequence thereto, also as set out above.

A variety of purification means can be used to generate the isolated fragments of the present invention. These
include, but are not limited to, methods which separate constituents of a solution based on charge, solubility, or size.

In one embodiment, Staphylococcus aureus DNA can be mechanically sheared to produce fragments of 15-20 kb
in length. These fragments can then be used to generate a Staphylococcus aureus library by inserting them into
lambda clones as described in the Examples below. Primers flanking, for example, an ORF, such as those enumerated
in Tables 1-3, can then be generated using nucleotide sequence information provided in SEQ ID NOS:1-5,191. Well
known and routine techniques of PCR cloning then can be used to isolate the ORF from the lambda DNA library of
Staphylococcus aureus genomic DNA. Thus, given the availability of SEQ ID NOS:1-5,191, the information in Tables
1, 2 and 3, and the information that may be obtained readily by analysis of the sequences of SEQ ID NOS:1-5,191
using methods set out above, those of skill will be enabled by the present disclosure to isolate any ORF-containing or
other nucleic acid fragment of the present invention.

The isolated nucleic acid molecules of the present invention include, but are not limited to, single stranded and
double stranded DNA, and single stranded RNA.

As used herein, an "open reading frame," ORF, means a series of triplets coding for amino acids without any 
termination codons and is a sequence translatable into protein. 

Tables 1, 2 and 3 list ORFs in the Staphylococcus aureus genomic contigs of the present invention that were
identified as putative coding regions by the GeneMark software using organism-specific second-order Markov probability
transition matrices. It will be appreciated that other criteria can be used, in accordance with well known analytical
methods, such as those discussed herein, to generate more inclusive, more restrictive or more selective lists.

Table 1 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are at least 80 amino
acids long and over a continuous region of at least 50 bases which are 95% or more identical (by BLAST analysis) to
an S. aureus nucleotide sequence available through Genbank in November 1996.

Table 2 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are not in Table 1 and
match, with a BLASTP probability score of 0.01 or less, a polypeptide sequence available through Genbank by September
1996.

Table 3 sets out ORFs in the Staphylococcus aureus contigs of the present invention that do not match significantly,
by BLASTP analysis, a polypeptide sequence available through Genbank by September 1996.

In each table, the first and second columns identify the ORF by, respectively, contig number and ORF number 
within the contig; the third column indicates the reading frame, taking the first 5' nucleotide of the contig as the start of 
the +1 frame; the fourth column indicates the first nucleotide of the ORF, counting from the 5' end of the contig strand; 
and the fifth column indicates the length of each ORF in nucleotides. 
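The ORF definition given above (a series of triplets without termination codons) and the table columns just described (reading frame, first nucleotide, length in nucleotides) can be illustrated with a simple scan. The sketch below is not the GeneMark method used to prepare the tables; it merely reports stop-free triplet runs in the three forward frames of one strand. The function name, the 1-based coordinate convention and the toy contig are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(contig: str, min_codons: int = 80) -> list:
    """Return (frame, first_nucleotide, length_nt) for each stop-free run
    of at least min_codons triplets in the three forward frames.

    frame is +1, +2 or +3 with the first 5' nucleotide starting frame +1;
    first_nucleotide is counted from 1 at the 5' end of the contig.
    """
    contig = contig.upper()
    orfs = []
    for offset in range(3):
        run_start = offset
        pos = offset
        while pos + 3 <= len(contig):
            if contig[pos:pos + 3] in STOP_CODONS:
                if (pos - run_start) // 3 >= min_codons:
                    orfs.append((offset + 1, run_start + 1, pos - run_start))
                run_start = pos + 3  # next run begins after the stop codon
            pos += 3
        # A run may also extend to the end of the contig without a stop.
        if (pos - run_start) // 3 >= min_codons:
            orfs.append((offset + 1, run_start + 1, pos - run_start))
    return orfs


if __name__ == "__main__":
    toy_contig = "ATGAAACCCGGGTAAATGTTT"  # placeholder, not a disclosed contig
    for frame, start, length in find_orfs(toy_contig, min_codons=3):
        print(f"frame +{frame}  start {start}  length {length} nt")
```

ORFs on the opposite strand could be found in the same way by applying the scan to the reverse complement of the contig.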

In Tables 1 and 2, column six lists the "Reference" for the closest matching sequence available through Genbank.
These reference numbers are the database entry numbers commonly used by those of skill in the art, who will be
familiar with their denominators. Descriptions of the nomenclature are available from the National Center for Biotechnology
Information. Column seven in Tables 1 and 2 provides the "gene name" of the matching sequence; column eight
provides the "BLAST identity" score from the comparison of the ORF and the homologous gene; and column nine
indicates the length in nucleotides of the "highest scoring segment pair" identified by the BLAST identity analysis.

In Table 3, the last column, column six, indicates the length of each ORF in amino acid residues. 

The concepts of percent identity and percent similarity of two polypeptide sequences are well understood in the art.
For example, two polypeptides 10 amino acids in length which differ at three amino acid positions (e.g., at positions
1, 3 and 5) are said to have a percent identity of 70%. However, the same two polypeptides would be deemed to have
a percent similarity of 80% if, for example at position 5, the amino acid moieties, although not identical, were "similar"
(i.e., possessed similar biochemical characteristics). Many programs for analysis of nucleotide or amino acid sequence
similarity, such as FASTA and BLAST, specifically list percent identity of a matching region as an output parameter. Thus,
for instance, Tables 1 and 2 herein enumerate the "percent identity" of the "highest scoring segment pair" in each ORF
and its listed relative. Further details concerning the algorithms and criteria used for homology searches are provided
below and are described in the pertinent literature highlighted by the citations provided below.
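The 70% identity and 80% similarity figures in the example above can be reproduced with a short calculation. The two decapeptides and the grouping of "similar" residues below are hypothetical choices made only so that the sequences differ at positions 1, 3 and 5 with a similar pair at position 5; real programs use substitution matrices rather than fixed groups.

```python
# Hypothetical groups of biochemically similar amino acids (illustration only).
SIMILAR_GROUPS = ["ILVM", "DE", "KR", "ST", "FYW", "NQ"]


def are_similar(x: str, y: str) -> bool:
    """True if the residues are identical or share a similarity group."""
    return x == y or any(x in group and y in group for group in SIMILAR_GROUPS)


def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions carrying identical residues."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def percent_similarity(a: str, b: str) -> float:
    """Percentage of aligned positions carrying identical or similar residues."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return 100.0 * sum(are_similar(x, y) for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    first = "MKTLVDEARG"   # hypothetical decapeptide
    second = "AKGLIDEARG"  # differs at positions 1, 3 and 5; V/I at position 5 is similar
    print(percent_identity(first, second))
    print(percent_similarity(first, second))
```

Seven of the ten positions are identical and an eighth is similar, giving the two percentages stated in the text.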

It will be appreciated that other criteria can be used to generate more inclusive and more exclusive listings of the
types set out in the tables. As those of skill will appreciate, narrow and broad searches both are useful. Thus, a skilled
artisan can readily identify ORFs in contigs of the Staphylococcus aureus genome other than those listed in Tables
1-3, such as ORFs which are overlapping or encoded by the opposite strand of an identified ORF, in addition to those
ascertainable using the computer-based systems of the present invention.

As used herein, an "expression modulating fragment," EMF, means a series of nucleotide molecules which modulates
the expression of an operably linked ORF or EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked sequence" when the expression
of the sequence is altered by the presence of the EMF. EMFs include, but are not limited to, promoters and
promoter modulating sequences (inducible elements). One class of EMFs are fragments which induce the expression
of an operably linked ORF in response to a specific regulatory factor or physiological event.

EMF sequences can be identified within the contigs of the Staphylococcus aureus genome by their proximity to
the ORFs provided in Tables 1-3. An intergenic segment, or a fragment of the intergenic segment, from about 10 to
200 nucleotides in length, taken from any one of the ORFs of Tables 1-3 will modulate the expression of an operably
linked ORF in a fashion similar to that found with the naturally linked ORF sequence. As used herein, an "intergenic
segment" refers to fragments of the Staphylococcus aureus genome which are between two ORF(s) herein described.
EMFs also can be identified using known EMFs as a target sequence or target motif in the computer-based systems
of the present invention. Further, the two methods can be combined and used together.
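The proximity-based approach can be sketched as a search for the gaps between adjacent ORFs on a contig. In the sketch below, ORFs are given as hypothetical (first nucleotide, length) pairs in the 1-based convention of the tables; as a simplification, strand is ignored and only gaps whose whole length falls between 10 and 200 nucleotides are kept, whereas the text also contemplates fragments of longer intergenic segments.

```python
def intergenic_segments(contig: str, orfs: list,
                        min_len: int = 10, max_len: int = 200) -> list:
    """Return (first_nucleotide, sequence) for each gap between consecutive ORFs.

    orfs is a list of (start, length_nt) pairs with 1-based starts.
    Overlapping or abutting ORFs produce no gap and are skipped.
    """
    segments = []
    ordered = sorted(orfs)
    for (start_a, length_a), (start_b, _) in zip(ordered, ordered[1:]):
        gap_start = start_a + length_a   # 1-based position just past the upstream ORF
        gap_len = start_b - gap_start    # nucleotides before the downstream ORF begins
        if min_len <= gap_len <= max_len:
            segments.append((gap_start, contig[gap_start - 1:start_b - 1]))
    return segments


if __name__ == "__main__":
    # Placeholder contig: three 30-nt "ORFs" separated by 15-nt and 5-nt gaps.
    toy_contig = "A" * 30 + "C" * 15 + "G" * 30 + "T" * 5 + "A" * 30
    toy_orfs = [(1, 30), (46, 30), (81, 30)]
    for start, seq in intergenic_segments(toy_contig, toy_orfs):
        print(start, len(seq), seq)
```

Candidate segments recovered in this manner would still need to be confirmed experimentally as EMFs.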

The presence and activity of an EMF can be confirmed using an EMF trap vector. An EMF trap vector contains a
cloning site linked to a marker sequence. A marker sequence encodes an identifiable phenotype, such as antibiotic
resistance or a complementing nutrition auxotrophic factor, which can be identified or assayed when the EMF trap
vector is placed within an appropriate host under appropriate conditions. As described above, an EMF will modulate the
expression of an operably linked marker sequence. A more detailed discussion of various marker sequences is provided
below.

A sequence which is suspected as being an EMF is cloned in all three reading frames in one or more restriction
sites upstream from the marker sequence in the EMF trap vector. The vector is then transformed into an appropriate
host using known procedures and the phenotype of the transformed host is examined under appropriate conditions.
As described above, an EMF will modulate the expression of an operably linked marker sequence.

As used herein, a "diagnostic fragment," DF, means a series of nucleotide molecules which selectively hybridize
to Staphylococcus aureus sequences. DFs can be readily identified by identifying unique sequences within contigs of
the Staphylococcus aureus genome, such as by using well-known computer analysis software, and by generating and
testing probes or amplification primers consisting of the DF sequence in an appropriate diagnostic format which determines
amplification or hybridization selectivity.
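The computational step of identifying unique sequences might be approximated by listing subsequences of a contig that do not occur in a set of comparison sequences. The sketch below uses exact matching of fixed-length subsequences on one strand only; the function name, the subsequence length and the toy sequences are assumptions, and exact-match uniqueness is only a first filter that does not itself establish hybridization selectivity.

```python
def unique_kmers(target: str, others: list, k: int = 20) -> list:
    """Return (first_nucleotide, kmer) for each length-k subsequence of target
    that is absent from every sequence in others (exact match, one strand)."""
    other_kmers = set()
    for seq in others:
        for i in range(len(seq) - k + 1):
            other_kmers.add(seq[i:i + k])
    candidates = []
    for i in range(len(target) - k + 1):
        kmer = target[i:i + k]
        if kmer not in other_kmers:
            candidates.append((i + 1, kmer))  # report 1-based positions
    return candidates


if __name__ == "__main__":
    # Placeholder sequences; k is kept small only so the toy example is readable.
    for position, kmer in unique_kmers("AAACCCGGG", ["AAACCCTTT"], k=4):
        print(position, kmer)
```

Candidates found in this way would then be tested as probes or primers in a diagnostic format, as described above.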

The sequences falling within the scope of the present invention are not limited to the specific sequences herein
described, but also include allelic and species variations thereof. Allelic and species variations can be routinely determined
by comparing the sequences provided in SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide
sequence at least 95%, preferably 99% and most preferably 99.9% identical to SEQ ID NOS:1-5,191, with a sequence
from another isolate of the same species.

Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules coding for the same
amino acid sequences as do the nucleic acid sequences mentioned above. In other words, in the coding region of an
ORF, substitution of one codon for another which encodes the same amino acid is expressly contemplated.
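Codon substitution of this kind can be checked by translating both sequences and comparing the results. The sketch below builds the standard genetic code from its conventional compact representation; the function names and the nine-nucleotide example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate a DNA coding sequence triplet by triplet ('*' marks a stop).
    Trailing nucleotides that do not complete a codon are ignored."""
    seq = coding_sequence.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def encodes_same_polypeptide(first: str, second: str) -> bool:
    """True if the two coding sequences translate to identical polypeptides."""
    return translate(first) == translate(second)


if __name__ == "__main__":
    original = "ATGGCTAAA"  # hypothetical coding sequence
    variant = "ATGGCCAAG"   # GCT->GCC and AAA->AAG are synonymous substitutions
    print(translate(original), translate(variant))
    print(encodes_same_polypeptide(original, variant))
```

Two sequences that pass this check differ in nucleotide sequence but encode the same amino acid sequence, which is the situation contemplated above.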

Any specific sequence disclosed herein can be readily screened for errors by resequencing a particular fragment,
such as an ORF, in both directions (i.e., sequence both strands). Alternatively, error screening can be performed by
sequencing corresponding polynucleotides of Staphylococcus aureus origin isolated by using part or all of the fragments
in question as a probe or primer.
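Once both strands of a fragment have been sequenced, the two reads can be compared by reverse-complementing one of them. The sketch below assumes, for simplicity, that the two reads cover exactly the same region with no insertions or deletions; the function names and example reads are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def strand_discrepancies(forward_read: str, reverse_read: str) -> list:
    """Compare a forward-strand read with a read of the opposite strand.

    Returns (position, forward_base, base_implied_by_reverse_read) for every
    1-based position at which the two reads disagree.
    """
    implied_forward = reverse_complement(reverse_read)
    if len(implied_forward) != len(forward_read):
        raise ValueError("reads must cover the same region")
    return [
        (i + 1, f, r)
        for i, (f, r) in enumerate(zip(forward_read.upper(), implied_forward))
        if f != r
    ]


if __name__ == "__main__":
    reverse_read = reverse_complement("ATGGCTAAA")  # error-free opposite-strand read
    print(strand_discrepancies("ATGGCTAAA", reverse_read))  # reads agree
    print(strand_discrepancies("ATGGCTACA", reverse_read))  # one disagreement
```

Positions flagged by such a comparison are candidates for sequencing errors and can be resolved by further resequencing.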

Each of the ORFs of the Staphylococcus aureus genome disclosed in Tables 1, 2 and 3, and the EMFs found 5'
to the ORFs, can be used as polynucleotide reagents in numerous ways. For example, the sequences can be used
as diagnostic probes or diagnostic amplification primers to detect the presence of a specific microbe in a sample,
particularly Staphylococcus aureus. Especially preferred in this regard are ORFs such as those of Table 3, which do not
match previously characterized sequences from other organisms and thus are most likely to be highly selective for
Staphylococcus aureus. Also particularly preferred are ORFs that can be used to distinguish between strains of
Staphylococcus aureus, particularly those that distinguish medically important strains, such as drug-resistant strains.

In addition, the fragments of the present invention, as broadly described, can be used to control gene expression
through triple helix formation or antisense DNA or RNA, both of which methods are based on the binding of a
polynucleotide sequence to DNA or RNA. Triple helix formation optimally results in a shut-off of RNA transcription from DNA,
while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Information from the
sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides.
Polynucleotides suitable for use in these methods are usually 20 to 40 bases in length and are designed to be complementary
to a region of the gene involved in transcription, for triple-helix formation, or to the mRNA itself, for antisense inhibition.
Both techniques have been demonstrated to be effective in model systems, and the requisite techniques are well known
and involve routine procedures. Triple helix techniques are discussed in, for example, Lee et al., Nucl. Acids Res. 6:
3073 (1979); Cooney et al., Science 456 (1988); and Dervan et al., Science 251:1360 (1991). Antisense techniques
in general are discussed in, for instance, Okano, J. Neurochem. 56:560 (1991) and OLIGODEOXYNUCLEOTIDES
AS ANTISENSE INHIBITORS OF GENE EXPRESSION, CRC Press, Boca Raton, FL (1988).
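The sequence-design step for antisense inhibition amounts to taking the reverse complement of the chosen mRNA region. The sketch below enforces the 20 to 40 base length range mentioned above and returns a DNA oligonucleotide; the function name, the 1-based coordinate convention and the example mRNA are hypothetical, and no account is taken of secondary structure, specificity or other practical design criteria.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligonucleotide complementary to an mRNA region.

    start is the 1-based first nucleotide of the target region; the result
    is written 5' to 3' and is the reverse complement of that region.
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases")
    sense = mrna.upper().replace("U", "T")  # work in the DNA alphabet
    region = sense[start - 1:start - 1 + length]
    if start < 1 or len(region) < length:
        raise ValueError("target region extends beyond the mRNA")
    return region.translate(DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    toy_mrna = "AUGGCUAAAGGUCCAUUAGCAGGAUCC"  # placeholder transcript
    print(antisense_oligo(toy_mrna, start=1, length=20))
```

An oligonucleotide produced in this way will base-pair with the selected region of the mRNA along its full length.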

The present invention further provides recombinant constructs comprising one or more fragments of the Staphylococcus
aureus genomic fragments and contigs of the present invention. Certain preferred recombinant constructs of
the present invention comprise a vector, such as a plasmid or viral vector, into which a fragment of the Staphylococcus
aureus genome has been inserted, in a forward or reverse orientation. In the case of a vector comprising one of the
ORFs of the present invention, the vector may further comprise regulatory sequences, including for example, a promoter,
operably linked to the ORF. For vectors comprising the EMFs of the present invention, the vector may further
comprise a marker sequence or heterologous ORF operably linked to the EMF.

Large numbers of suitable vectors and promoters are known to those of skill in the art and are commercially
available for generating the recombinant constructs of the present invention. The following vectors are provided by
way of example. Useful bacterial vectors include phagescript, PsiX174, pBluescript SK and KS (+ and -), pNH8a,
pNH16a, pNH18a, pNH46a (available from Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available
from Pharmacia). Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG (available from Stratagene);
pSVK3, pBPV, pMSG, pSVL (available from Pharmacia).

Promoter regions can be selected from any desired gene using CAT (chloramphenicol transferase) vectors or other
vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters
include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV
thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-1. Selection of the appropriate
vector and promoter is well within the level of ordinary skill in the art.

The present invention further provides host cells containing any one of the isolated fragments of the Staphylococcus
aureus genomic fragments and contigs of the present invention, wherein the fragment has been introduced into the
host cell using known methods. The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or a procaryotic cell, such as a bacterial cell.

A polynucleotide of the present invention, such as a recombinant construct comprising an ORF of the present
invention, may be introduced into the host by a variety of well established techniques that are standard in the art, such
as calcium phosphate transfection, DEAE-dextran mediated transfection and electroporation, which are described in,
for instance, Davis, L. et al., BASIC METHODS IN MOLECULAR BIOLOGY (1986).

A host cell containing one of the fragments of the Staphylococcus aureus genomic fragments and contigs of the
present invention can be used in conventional manners to produce the gene product encoded by the isolated fragment
(in the case of an ORF) or can be used to produce a heterologous protein under the control of the EMF.

The present invention further provides isolated polypeptides encoded by the nucleic acid fragments of the present
invention or by degenerate variants of the nucleic acid fragments of the present invention. By "degenerate variant" is
intended nucleotide fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF) by
nucleotide sequence but, due to the degeneracy of the Genetic Code, encode an identical polypeptide sequence.

Preferred nucleic acid fragments of the present invention are the ORFs depicted in Tables 2 and 3 which encode
proteins.

A variety of methodologies known in the art can be utilized to obtain any one of the isolated polypeptides or proteins
of the present invention. At the simplest level, the amino acid sequence can be synthesized using commercially available
peptide synthesizers. This is particularly useful in producing small peptides and fragments of larger polypeptides.
Such short fragments as may be obtained most readily by synthesis are useful, for example, in generating antibodies
against the native polypeptide, as discussed further below.

In an alternative method, the polypeptide or protein is purified from bacterial cells which naturally produce the
polypeptide or protein. One skilled in the art can readily employ well-known methods for isolating polypeptides and
proteins to isolate and purify polypeptides or proteins of the present invention produced naturally by a bacterial strain,
or by other methods. Methods for isolation and purification that can be employed in this regard include, but are not
limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, and
immuno-affinity chromatography.

The polypeptides and proteins of the present invention also can be purified from cells which have been altered to
express the desired polypeptide or protein. As used herein, a cell is said to be altered to express a desired polypeptide
or protein when the cell, through genetic manipulation, is made to produce a polypeptide or protein which it normally
does not produce or which the cell normally produces at a lower level. Those skilled in the art can readily adapt procedures
for introducing and expressing either recombinant or synthetic sequences into eukaryotic or prokaryotic cells
in order to generate a cell which produces one of the polypeptides or proteins of the present invention.

Any host/vector system can be used to express one or more of the ORFs of the present invention. These include,
but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, COS cells, and Sf9 cells, as well as prokaryotic
hosts such as E. coli and B. subtilis. The most preferred cells are those which do not normally express the particular
polypeptide or protein or which express the polypeptide or protein at a low natural level.

"Recombinant," as used herein, means that a polypeptide or protein is derived from recombinant (e.g., microbial
or mammalian) expression systems. "Microbial" refers to recombinant polypeptides or proteins made in bacterial or
fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" defines a polypeptide or protein essentially
free of native endogenous substances and unaccompanied by associated native glycosylation. Polypeptides or
proteins expressed in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern different from that expressed in mammalian cells.

"Nucleotide sequence" refers to a heteropolymer of deoxyribonucleotides. Generally, DNA segments encoding the
polypeptides and proteins provided by this invention are assembled from fragments of the Staphylococcus aureus
genome and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a synthetic gene which is
capable of being expressed in a recombinant transcriptional unit comprising regulatory elements derived from a microbial
or viral operon.

"Recombinant expression vehicle or vector" refers to a plasmid or phage or virus or vector, for expressing a polypeptide
from a DNA (RNA) sequence. The expression vehicle can comprise a transcriptional unit comprising an assembly
of (1) a genetic regulatory element or elements necessary for gene expression in the host, including elements required to initiate
and maintain transcription at a level sufficient for suitable expression of the desired polypeptide, including, for example,
promoters and, where necessary, an enhancer and a polyadenylation signal; (2) a structural or coding sequence
which is transcribed into mRNA and translated into protein; and (3) appropriate signals to initiate translation at the
beginning of the desired coding region and terminate translation at its end. Structural units intended for use in yeast
or eukaryotic expression systems preferably include a leader sequence enabling extracellular secretion of translated
protein by a host cell. Alternatively, where recombinant protein is expressed without a leader or transport sequence,
it may include an N-terminal methionine residue. This residue may or may not be subsequently cleaved from the
expressed recombinant protein to provide a final product.

"Recombinant expression system" means host cells which have stably integrated a recombinant transcriptional
unit into chromosomal DNA or carry the recombinant transcriptional unit extrachromosomally. The cells can be prokaryotic
or eukaryotic. Recombinant expression systems as defined herein will express heterologous polypeptides or proteins
upon induction of the regulatory elements linked to the DNA segment or synthetic gene to be expressed.

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived
from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic
and eukaryotic hosts are described in Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2nd Edition,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1989), the disclosure of which is hereby
incorporated by reference in its entirety.

Generally, recombinant expression vectors will include origins of replication and selectable markers permitting
transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a
promoter derived from a highly expressed gene to direct transcription of a downstream structural sequence. Such
promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK),
alpha-factor, acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled
in appropriate phase with translation initiation and termination sequences, and preferably, a leader sequence capable
of directing secretion of translated protein into the periplasmic space or extracellular medium. Optionally, the heterologous
sequence can encode a fusion protein including an N-terminal identification peptide imparting desired characteristics,
e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a
desired protein together with suitable translation initiation and termination signals in operable reading phase with a
functional promoter. The vector will comprise one or more phenotypic selectable markers and an origin of replication
to ensure maintenance of the vector and, when desirable, provide amplification within the host.

Suitable prokaryotic hosts for transformation include strains of Staphylococcus aureus, E. coli, B. subtilis, Salmonella
typhimurium and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus. Others
may also be employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use can comprise a selectable 
marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements 
of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3
(available from Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (available from Promega Biotec, Madison,
WI, USA). These pBR322 "backbone" sections are combined with an appropriate promoter and the structural sequence
to be expressed. 

Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the
selected promoter, where it is inducible, is derepressed or induced by appropriate means (e.g., temperature shift or
chemical induction) and cells are cultured for an additional period to provide for expression of the induced gene product.
Thereafter cells are typically harvested, generally by centrifugation, disrupted to release expressed protein, generally
by physical or chemical means, and the resulting crude extract is retained for further purification.

Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of mammalian
expression systems include the COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23:175
(1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and
BHK cell lines.

Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also 
any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination 
sequences, and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for
example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide the required
nontranscribed genetic elements.

Recombinant polypeptides and proteins produced in bacterial culture are usually isolated by initial extraction from
cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion chromatography steps.
Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell lysing agents. Protein refolding steps can be used, as
necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can
be employed for final purification steps.

An additional aspect of the invention includes Staphylococcus aureus polypeptides which are useful as
immunodiagnostic antigens and/or immunoprotective vaccines, collectively "immunologically useful polypeptides". Such
immunologically useful polypeptides may be selected from the ORFs disclosed herein based on techniques well known
in the art and described elsewhere herein. The inventors have used the following criteria to select several
immunologically useful polypeptides:

As is known in the art, an amino terminal type I signal sequence directs a nascent protein across the plasma and
outer membranes to the exterior of the bacterial cell. Such outermembrane polypeptides are expected to be
immunologically useful. According to Izard, J. W. et al., Mol. Microbiol. 13, 765-773 (1994), polypeptides containing type I
signal sequences have the following physical attributes: the length of the type I signal sequence is approximately
15 to 25 primarily hydrophobic amino acid residues with a net positive charge in the extreme amino terminus; the
central region of the signal sequence must adopt an alpha-helical conformation in a hydrophobic environment; and the
region surrounding the actual site of cleavage is ideally six residues long, with small side-chain amino acids in the -1
and -3 positions.
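As an illustration only, the type I criteria above can be expressed as a simple screen. The following is a minimal sketch, not the inventors' algorithm: the residue sets, the five-residue window taken as the "extreme amino terminus", the 50% hydrophobicity cutoff, and the example sequence are all assumptions introduced here, and the alpha-helical requirement is not tested.

```python
# Hypothetical residue classes; the text does not define them.
SMALL = set("AGSCT")
HYDROPHOBIC = set("AVLIMFWC")


def looks_like_type1_signal(signal: str) -> bool:
    """Screen a candidate signal peptide whose last residue is position -1."""
    # Length of approximately 15 to 25 residues.
    if not 15 <= len(signal) <= 25:
        return False
    # Net positive charge in the extreme amino terminus (assumed: first 5 residues).
    n_term = signal[:5]
    charge = sum(r in "KR" for r in n_term) - sum(r in "DE" for r in n_term)
    if charge <= 0:
        return False
    # "Primarily hydrophobic" is taken here to mean more than half of the residues.
    if sum(r in HYDROPHOBIC for r in signal) <= len(signal) / 2:
        return False
    # Small side-chain amino acids at the -1 and -3 positions.
    return signal[-1] in SMALL and signal[-3] in SMALL
```

A made-up peptide such as `MKKLLIALLAVLALSAFAASA` passes this screen, while the same peptide with acidic residues at positions 2 and 3 does not.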

Also known in the art is the type IV signal sequence which is an example of the several types of functional signal 
sequences which exist in addition to the type I signal sequence detailed above. Although functionally related, the type 
IV signal sequence possesses a unique set of biochemical and physical attributes (Strom, M. S. and Lory, S.,
J. Bacteriol. 174, 7345-7351 (1992)). These are typically six to eight amino acids with a net basic charge followed by an
additional sixteen to thirty primarily hydrophobic residues. The cleavage site of a type IV signal sequence is typically
after the initial six to eight amino acids at the extreme amino terminus. In addition, all type IV signal sequences contain 
a phenylalanine residue at the +1 site relative to the cleavage site. 
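The type IV attributes can likewise be sketched as a screen. This is a hedged illustration under stated assumptions: the hydrophobic residue set, the use of only the first sixteen residues after the cleavage site, the majority cutoff for "primarily hydrophobic", and the function name are hypothetical and do not come from the specification.

```python
# Hypothetical hydrophobic residue class; the text does not define one.
HYDROPHOBIC = set("AVLIMFWC")


def find_type4_cleavage(protein: str):
    """Return the leader length (6, 7 or 8) if the N-terminus fits, else None."""
    for leader_len in (6, 7, 8):
        leader = protein[:leader_len]
        mature = protein[leader_len:]
        # Phenylalanine at +1, and at least sixteen residues after the cleavage site.
        if len(mature) < 16 or mature[0] != "F":
            continue
        # Leader peptide must carry a net basic charge.
        net_charge = sum(r in "KR" for r in leader) - sum(r in "DE" for r in leader)
        if net_charge <= 0:
            continue
        # Assumed test: more than half of the first sixteen residues are hydrophobic.
        core = mature[:16]
        if sum(r in HYDROPHOBIC for r in core) > len(core) / 2:
            return leader_len
    return None
```

The function reports where cleavage would fall; a sequence lacking phenylalanine at +1 for every leader length from six to eight residues yields `None`.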

Studies of the cleavage sites of twenty-six bacterial lipoprotein precursors have allowed the definition of a consensus
amino acid sequence for lipoprotein cleavage. Nearly three-fourths of the bacterial lipoprotein precursors examined
contained the sequence L-(A,S)-(G,A)-C at positions -3 to +1, relative to the point of cleavage (Hayashi, S. and Wu,
H. C. Lipoproteins in bacteria. J. Bioenerg. Biomembr. 22, 451-471; 1990).
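The lipoprotein consensus L-(A,S)-(G,A)-C maps directly onto a regular expression. The sketch below is illustrative; the function name and the example sequences are hypothetical. It reports the 0-based index of the cysteine at +1, so cleavage falls immediately before that index.

```python
import re

# L at -3, A or S at -2, G or A at -1, C at +1.
LIPOBOX = re.compile(r"L[AS][GA]C")


def lipoprotein_cleavage_sites(protein: str) -> list:
    """Return 0-based indices of the +1 cysteine for each consensus match."""
    return [m.start() + 3 for m in LIPOBOX.finditer(protein)]
```

Because roughly a quarter of the precursors examined did not carry this exact sequence, an empty result does not rule out a lipoprotein.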

It is well known that most anchored proteins found on the surface of gram-positive bacteria possess a highly
conserved carboxy terminal sequence. More than fifty such proteins from organisms such as S. pyogenes, S. mutans, E.
faecalis, S. pneumoniae, and others, have been identified based on their extracellular location and carboxy terminal
amino acid sequence (Fischetti, V. A. Gram-positive commensal bacteria deliver antigens to elicit mucosal and systemic
immunity. ASM News 62, 405-410; 1996). The conserved region is comprised of six charged amino acids at the extreme
carboxy terminus coupled to 15-20 hydrophobic amino acids presumed to function as a transmembrane domain.
Immediately adjacent to the transmembrane domain is a six amino acid sequence conserved in nearly all proteins
examined. The amino acid sequence of this region is L-P-X-T-G-X, where X is any amino acid.
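A search for the L-P-X-T-G-X sequence can be sketched with a regular expression. This is a minimal illustration under the assumption that sequences are given in upper-case one-letter code; the function name and test sequences are hypothetical, and the sketch does not check the adjacent hydrophobic domain or the charged carboxy terminus.

```python
import re

# L-P-X-T-G-X, where X is any amino acid (one-letter code, upper case assumed).
LPXTGX = re.compile(r"LP[A-Z]TG[A-Z]")


def find_anchor_motifs(protein: str) -> list:
    """Return (0-based start index, matched hexapeptide) for each occurrence."""
    return [(m.start(), m.group()) for m in LPXTGX.finditer(protein)]
```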

Amino acid sequence similarities to proteins of known function by BLAST enables the assignment of putative
functions to novel amino acid sequences and allows for the selection of proteins thought to function outside the cell
wall. Such proteins are well known in the art and include "lipoprotein", "periplasmic", or "antigen".

An algorithm for selecting antigenic and immunogenic Staphylococcus aureus polypeptides including the foregoing
criteria was developed by the present inventors. Use of the algorithm by the inventors to select immunologically useful
Staphylococcus aureus polypeptides resulted in the selection of several ORFs which are predicted to be
outermembrane-associated proteins. These proteins are identified in Table 4, below, and shown in the Sequence Listing as SEQ
ID NOS:5,192 to 5,255. Thus the amino acid sequence of each of several antigenic Staphylococcus aureus polypeptides
listed in Table 4 can be determined, for example, by locating the amino acid sequence of the ORF in the Sequence
Listing. Likewise the polynucleotide sequence encoding each ORF can be found by locating the corresponding
polynucleotide SEQ ID in Tables 1, 2, or 3, and finding the corresponding nucleotide sequence in the sequence listing.

As will be appreciated by those of ordinary skill in the art, although a polypeptide representing an entire ORF may
be the closest approximation to a protein found in vivo, it is not always technically practical to express a complete ORF
in vitro. It may be very challenging to express and purify a highly hydrophobic protein by common laboratory methods.
As a result, the immunologically useful polypeptides described herein as SEQ ID NOS:5,192-5,255 may have been
modified slightly to simplify the production of recombinant protein, and are the preferred embodiments. In general,
nucleotide sequences which encode highly hydrophobic domains, such as those found at the amino terminal signal
sequence, are excluded for enhanced in vitro expression of the polypeptides. Furthermore, any highly hydrophobic
amino acid sequences occurring at the carboxy terminus are also excluded. Such truncated polypeptides include for
example the mature forms of the polypeptides expected to exist in nature.

Those of ordinary skill in the art can identify soluble portions of the polypeptides identified in Table 4, and in the case
of truncated polypeptide sequences shown as SEQ ID NOS:5,192-5,255, may obtain the complete predicted amino
acid sequence of each polypeptide by translating the corresponding polynucleotide sequences of the corresponding
ORF listed in Tables 1, 2 and 3 and found in the sequence listing.

Accordingly, polypeptides comprising the complete amino acid sequence of an immunologically useful polypeptide selected
from the group of polypeptides encoded by the ORFs identified in Table 4, or an amino acid sequence at least 95%
identical thereto, preferably at least 97% identical thereto, and most preferably at least 99% identical thereto form an
embodiment of the invention; in addition polypeptides comprising an amino acid sequence selected from the group of
amino acid sequences shown in the sequence listing as SEQ ID NOS:5,191-5,255, or an amino acid sequence at least
95% identical thereto, preferably at least 97% identical thereto and most preferably at least 99% identical thereto, form
an embodiment of the invention. Polynucleotides encoding the foregoing polypeptides also form part of the present 
invention. 

In another aspect, the invention provides a peptide or polypeptide comprising an epitope-bearing portion of a
polypeptide of the invention, particularly those epitope-bearing portions (antigenic regions) identified in Table 4. The
epitope-bearing portion is an immunogenic or antigenic epitope of a polypeptide of the invention. An "immunogenic
epitope" is defined as a part of a protein that elicits an antibody response when the whole protein is the immunogen.
On the other hand, a region of a protein molecule to which an antibody can bind is defined as an "antigenic epitope."
The number of immunogenic epitopes of a protein generally is less than the number of antigenic epitopes. See, for
instance, Geysen et al., Proc. Natl. Acad. Sci. USA 81:3998-4002 (1983).

As to the selection of peptides or polypeptides bearing an antigenic epitope (i.e., that contain a region of a protein
molecule to which an antibody can bind), it is well known in the art that relatively short synthetic peptides that mimic
part of a protein sequence are routinely capable of eliciting an antiserum that reacts with the partially mimicked protein.
See, for instance, Sutcliffe, J. G., Shinnick, T. M., Green, N. and Lerner, R. A. (1983) "Antibodies that react with
predetermined sites on proteins". Science 219:660-666. Peptides capable of eliciting protein-reactive sera are
frequently represented in the primary sequence of a protein, can be characterized by a set of simple chemical rules, and
are confined neither to immunodominant regions of intact proteins (i.e., immunogenic epitopes) nor to the amino or
carboxyl terminals. Antigenic epitope-bearing peptides and polypeptides of the invention are therefore useful to raise
antibodies, including monoclonal antibodies, that bind specifically to a polypeptide of the invention. See, for instance,
Wilson et al., Cell 37:767-778 (1984) at 777.

Antigenic epitope-bearing peptides and polypeptides of the invention preferably contain a sequence of at least 
seven, more preferably at least nine and most preferably between about 15 to about 30 amino acids contained within 
the amino acid sequence of a polypeptide of the invention. Non-limiting examples of antigenic polypeptides or peptides 
that can be used to generate S. aureus specific antibodies include: a polypeptide comprising peptides shown in Table
4 below. These polypeptide fragments have been determined to bear antigenic epitopes of indicated S. aureus proteins
by the analysis of the Jameson-Wolf antigenic index, a representative sample of which is shown in Figure 3. 

The epitope-bearing peptides and polypeptides of the invention may be produced by any conventional means.
See, e.g., Houghten, R. A. (1985) General method for the rapid solid-phase synthesis of large numbers of peptides:
specificity of antigen-antibody interaction at the level of individual amino acids. Proc. Natl. Acad. Sci. USA 82:
5131-5135; this "Simultaneous Multiple Peptide Synthesis (SMPS)" process is further described in U.S. Patent No.
4,631,211 to Houghten et al. (1986). Epitope-bearing peptides and polypeptides of the invention are used to induce
antibodies according to methods well known in the art. See, for instance, Sutcliffe et al., supra; Wilson et al., supra;
Chow, M. et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle, F. J. et al., J. Gen. Virol. 66:2347-2354 (1985).

Immunogenic epitope-bearing peptides of the invention, i.e., those parts of a protein that elicit an antibody response
when the whole protein is the immunogen, are identified according to methods known in the art. See, for instance,
Geysen et al., supra. Further still, U.S. Patent No. 5,194,392 to Geysen (1990) describes a general method of detecting
or determining the sequence of monomers (amino acids or other compounds) which is a topological equivalent of the
epitope (i.e., a "mimotope") which is complementary to a particular paratope (antigen binding site) of an antibody of
interest. More generally, U.S. Patent No. 4,433,092 to Geysen (1989) describes a method of detecting or determining
a sequence of monomers which is a topographical equivalent of a ligand which is complementary to the ligand binding
site of a particular receptor of interest. Similarly, U.S. Patent No. 6,480,971 to Houghten, R. A. et al. (1996) on
Peralkylated Oligopeptide Mixtures discloses linear C1-C7-alkyl peralkylated oligopeptides and sets and libraries of such
peptides, as well as methods for using such oligopeptide sets and libraries for determining the sequence of a
peralkylated oligopeptide that preferentially binds to an acceptor molecule of interest. Thus, non-peptide analogs of the
epitope-bearing peptides of the invention also can be made routinely by these methods.

Table 4 lists immunologically useful polypeptides identified by an algorithm which locates novel Staphylococcus
aureus outermembrane proteins, as is described above. Also listed are epitopes or "antigenic regions" of each of the
identified polypeptides. The antigenic regions, or epitopes, are delineated by two numbers x-y, where x is the number
of the first amino acid in the open reading frame included within the epitope and y is the number of the last amino acid
in the open reading frame included within the epitope. For example, the first epitope in ORF 168-6 is comprised of
amino acids 36 to 45 of SEQ ID NO:5,192, as is described in Table 4. The inventors have identified several epitopes
for each of the antigenic polypeptides identified in Table 4. Accordingly, forming part of the present invention are
polypeptides comprising an amino acid sequence of one or more antigenic regions identified in Table 4. The invention
further provides polynucleotides encoding such polypeptides.
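The x-y notation for antigenic regions uses 1-based, inclusive amino acid numbering. As a minimal sketch of how such a region could be pulled out of an ORF's amino acid sequence, the hypothetical helper below converts that numbering to a string slice; the function name and the placeholder sequence in the usage note are illustrative and are not taken from the Sequence Listing.

```python
def extract_epitope(orf_protein: str, region: str) -> str:
    """Return the residues named by a 1-based, inclusive 'x-y' region string."""
    first, last = (int(part) for part in region.split("-"))
    if not 1 <= first <= last <= len(orf_protein):
        raise ValueError(f"region {region} lies outside the sequence")
    # Residue x sits at index x-1; the slice end is exclusive, so y is used as-is.
    return orf_protein[first - 1:last]
```

For a placeholder 50-residue sequence built as `"ACDEFGHIKL" * 5`, the region `"36-45"` returns the ten residues `GHIKLACDEF`.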

The present invention further includes isolated polypeptides, proteins and nucleic acid molecules which are
substantially equivalent to those herein described. As used herein, substantially equivalent can refer both to nucleic acid
and amino acid sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse functional dissimilarity
between reference and subject sequences. For purposes of the present invention, sequences having equivalent biological
activity, and equivalent expression characteristics are considered substantially equivalent. For purposes of determining
equivalence, truncation of the mature sequence should be disregarded.

The invention further provides methods of obtaining homologs from other strains of Staphylococcus aureus, of the
fragments of the Staphylococcus aureus genome of the present invention and homologs of the proteins encoded by
the ORFs of the present invention. As used herein, a sequence or protein of Staphylococcus aureus is defined as a
homolog of a fragment of the Staphylococcus aureus fragments or contigs or a protein encoded by one of the ORFs
of the present invention, if it shares significant homology to one of the fragments of the Staphylococcus aureus genome
of the present invention or a protein encoded by one of the ORFs of the present invention. Specifically, by using the
sequence disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque
hybridization, one skilled in the art can obtain homologs.

As used herein, two nucleic acid molecules or proteins are said to "share significant homology" if the two contain
regions which possess greater than 85% sequence (amino acid or nucleic acid) homology. Preferred homologs in this
regard are those with more than 90% homology. Especially preferred are those with 93% or more homology. Among
especially preferred homologs those with 95% or more homology are particularly preferred. Very particularly preferred
among these are those with 97% and even more particularly preferred among those are homologs with 99% or more
homology. The most preferred homologs among these are those with 99.9% homology or more. It will be understood
that, among measures of homology, identity is particularly preferred in this regard.
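Taking identity as the measure of homology, the thresholds above can be applied as in the following sketch. It assumes the two regions have already been aligned to equal length and counts any mismatched position, including a gap character, against identity; the function names and tier labels are hypothetical conveniences, not terms defined in the specification.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned, equal-length regions."""
    if len(a) != len(b) or not a:
        raise ValueError("regions must be non-empty and of equal aligned length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def homology_tier(pct: float) -> str:
    """Map a percent identity onto the preference tiers stated in the text."""
    if pct >= 99.9:
        return "most preferred (99.9% or more)"
    if pct >= 99:
        return "99% or more"
    if pct >= 97:
        return "97% or more"
    if pct >= 95:
        return "95% or more"
    if pct >= 93:
        return "93% or more"
    if pct > 90:
        return "more than 90%"
    if pct > 85:
        return "significant (greater than 85%)"
    return "not significant"
```

Note that the two lowest tiers are strict inequalities ("greater than 85%", "more than 90%"), whereas the higher tiers are stated as "or more".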

Region specific primers or probes derived from the nucleotide sequence provided in SEQ ID NOS:1-5,191 or from
a nucleotide sequence at least 95%, particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ
ID NOS:1-5,191 can be used to prime DNA synthesis and PCR amplification, as well as to identify colonies containing
cloned DNA encoding a homolog. Methods suitable to this aspect of the present invention are well known and have
been described in great detail in many publications such as, for example, Innis et al., PCR PROTOCOLS, Academic
Press, San Diego, CA (1990).

When using primers derived from SEQ ID NOS:1-5,191 or from a nucleotide sequence having an aforementioned
identity to a sequence of SEQ ID NOS:1-5,191, one skilled in the art will recognize that by employing high stringency
conditions (e.g., annealing at 50-60°C in 6X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC) only
sequences which are greater than 75% homologous to the primer will be amplified. By employing lower stringency
conditions (e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X SSPC),
sequences which are greater than 40-50% homologous to the primer will also be amplified.

When using DNA probes derived from SEQ ID NOS:1-5,191, or from a nucleotide sequence having an
aforementioned identity to a sequence of SEQ ID NOS:1-5,191, for colony/plaque hybridization, one skilled in the art will
recognize that by employing high stringency conditions (e.g., hybridizing at 50-65°C in 5X SSPC and 50% formamide, and
washing at 50-65°C in 0.5X SSPC), sequences having regions which are greater than 90% homologous to the probe
can be obtained, and that by employing lower stringency conditions (e.g., hybridizing at 35-37°C in 5X SSPC and
40-45% formamide, and washing at 42°C in 0.5X SSPC), sequences having regions which are greater than 35-45%
homologous to the probe will be obtained.

Any organism can be used as the source for homologs of the present invention so long as the organism naturally
expresses such a protein or contains genes encoding the same. The most preferred organisms for isolating homologs
are bacteria which are closely related to Staphylococcus aureus.

ILLUSTRATIVE USES OF COMPOSITIONS OF THE INVENTION 


Each ORF provided in Tables 1 and 2 is identified with a function by homology to a known gene or polypeptide. 
As a result, one skilled in the art can use the polypeptides of the present invention for commercial, therapeutic and 
industrial purposes consistent with the type of putative identification of the polypeptide. Such identifications permit one 
skilled in the art to use the Staphylococcus aureus ORFs in a manner similar to the known type of sequences for which
the identification is made; for example, to ferment a particular sugar source or to produce a particular metabolite. A
variety of reviews illustrative of this aspect of the invention are available, including the following reviews on the industrial
use of enzymes, for example, BIOCHEMICAL ENGINEERING AND BIOTECHNOLOGY HANDBOOK, 2nd Ed.,
Macmillan Publications, Ltd., NY (1991) and BIOCATALYSTS IN ORGANIC SYNTHESES, Tramper et al., Eds., Elsevier
Science Publishers, Amsterdam, The Netherlands (1985). A variety of exemplary uses that illustrate this and similar
aspects of the present invention are discussed below.

1. Biosynthetic Enzymes 

Open reading frames encoding proteins involved in mediating the catalytic reactions involved in intermediary and
macromolecular metabolism, the biosynthesis of small molecules, cellular processes and other functions can be used
for industrial biosynthesis. These include enzymes involved in the degradation of the intermediary products of
metabolism, enzymes involved in central intermediary metabolism, enzymes involved in respiration, both aerobic and
anaerobic, enzymes involved in fermentation, enzymes involved in ATP proton motive force conversion, enzymes
involved in broad regulatory function, enzymes involved in amino acid synthesis, enzymes involved in nucleotide
synthesis, and enzymes involved in cofactor and vitamin synthesis.

The various metabolic pathways present in Staphylococcus aureus can be identified based on absolute nutritional 
requirements as well as by examining the various enzymes identified in Tables 1-3 and SEQ ID NOS:1-5,191.

Of particular interest are polypeptides involved in the degradation of intermediary metabolites as well as
non-macromolecular metabolism. Such enzymes include amylases, glucose oxidases, and catalase.

Proteolytic enzymes are another class of commercially important enzymes. Proteolytic enzymes find use in a
number of industrial processes including the processing of flax and other vegetable fibers, in the extraction, clarification
and depectinization of fruit juices, in the extraction of vegetable oil and in the maceration of fruits and vegetables to
give unicellular fruits. A detailed review of the proteolytic enzymes used in the food industry is provided in Rombouts
et al., Symbiosis 21:79 (1986) and Voragen et al. in BIOCATALYSTS IN AGRICULTURAL BIOTECHNOLOGY,
Whitaker et al., Eds., American Chemical Society Symposium Series 389:93 (1989).

The metabolism of sugars is an important aspect of the primary metabolism of Staphylococcus aureus. Enzymes
involved in the degradation of sugars, such as, particularly, glucose, galactose, fructose and xylose, can be used in
industrial fermentation. Some of the important sugar transforming enzymes, from a commercial viewpoint, include
sugar isomerases such as glucose isomerase. Other metabolic enzymes have found commercial use, such as glucose
oxidases, which produce ketogulonic acid (KGA). KGA is an intermediate in the commercial production of ascorbic
acid using Reichstein's procedure, as described in Krueger et al., Biotechnology 6(A), Rhine et al., Eds., Verlag
Press, Weinheim, Germany (1984).

Glucose oxidase (GOD) is commercially available and has been used in purified form as well as in an immobilized
form for the deoxygenation of beer. See, for instance, Hartmeir et al., Biotechnology Letters 1:21 (1979). The most
important application of GOD is the industrial scale fermentation of gluconic acid. There is a market for gluconic acids,
which are used in the detergent, textile, leather, photographic, pharmaceutical, food, feed and concrete industries, as
described, for example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS AND FUNGI; Benett et al., Eds.,
Academic Press, New York (1985). In addition to industrial applications, GOD has found applications in medicine for
quantitative determination of glucose in body fluids and, recently, in biotechnology for analyzing syrups from starch and
cellulose hydrolysates. This application is described in Owusu et al., Biochem. et Biophysica Acta 872:83 (1986), for
instance.

The main sweetener used in the world today is sugar which comes from sugar beets and sugar cane. In the field
of industrial enzymes, the glucose isomerase process shows the largest expansion in the market today. Initially, soluble
enzymes were used and later immobilized enzymes were developed (Krueger et al., Biotechnology, The Textbook of
Industrial Microbiology, Sinauer Associated Incorporated, Sunderland, Massachusetts (1990)). Today, the use of
glucose-produced high fructose syrups is by far the largest industrial business using immobilized enzymes. A review of
the industrial use of these enzymes is provided by Jorgensen, Starch 40:307 (1988).

Proteinases, such as alkaline serine proteinases, are used as detergent additives and thus represent one of the
largest volumes of microbial enzymes used in the industrial sector. Because of their industrial importance, there is a
large body of published and unpublished information regarding the use of these enzymes in industrial processes. (See
Faultman et al., Acid Proteases Structure Function and Biology, Tang, J., ed., Plenum Press, New York (1977) and
Godfrey et al., Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al., Report Industrial
Enzymes by 1990, Hel Hepner & Associates, London (1986)).

Another class of commercially usable proteins of the present invention are the microbial lipases, described by, for
instance, Macrae et al., Philosophical Transactions of the Chiral Society of London 310:227 (1985) and Poserke,
Journal of the American Oil Chemist Society 61:1758 (1984). A major use of lipases is in the fat and oil industry for the
production of neutral glycerides using lipase catalyzed inter-esterification of readily available triglycerides. Applications
of lipases include the use as a detergent additive to facilitate the removal of fats from fabrics in the course of the
washing procedures.

The use of enzymes, and in particular microbial enzymes, as catalysts for key steps in the synthesis of complex
organic molecules is gaining popularity at a great rate. One area of great interest is the preparation of chiral
intermediates. Preparation of chiral intermediates is of interest to a wide range of synthetic chemists, particularly those scientists
involved with the preparation of new pharmaceuticals, agrochemicals, fragrances and flavors. (See Davies et al.,
Recent Advances in the Generation of Chiral Intermediates Using Enzymes, CRC Press, Boca Raton, Florida (1990)).
The following reactions catalyzed by enzymes are of interest to organic chemists: hydrolysis of carboxylic acid esters,
phosphate esters, amides and nitriles, esterification reactions, trans-esterification reactions, synthesis of amides,
reduction of alkanones and oxoalkanates, oxidation of alcohols to carbonyl compounds, oxidation of sulfides to sulfoxides,
and carbon bond forming reactions such as the aldol reaction.

When considering the use of an enzyme encoded by one of the ORFs of the present invention for biotransformation
and organic synthesis it is sometimes necessary to consider the respective advantages and disadvantages of using a
microorganism as opposed to an isolated enzyme. The pros and cons of using a whole cell system on the one hand or an
isolated partially purified enzyme on the other hand have been described in detail by Bud et al., Chemistry in Britain
(1987), p. 127.

Amino transferases, enzymes involved in the biosynthesis and metabolism of amino acids, are useful in the catalytic
production of amino acids. The advantages of using microbial based enzyme systems are that the amino transferase
enzymes catalyze the stereoselective synthesis of only L-amino acids and generally possess uniformly high catalytic
rates. A description of the use of aminotransferases for amino acid production is provided by Roselle-David, Methods
of Enzymology 136:479 (1987).

Another category of useful proteins encoded by the ORFs of the present invention includes enzymes involved in
nucleic acid synthesis, repair, and recombination. A variety of commercially important enzymes have previously been
isolated from members of Staphylococcus aureus. These include Sau3A and Sau96I.

2. Generation of Antibodies

As described here, the proteins of the present invention, as well as homologs thereof, can be used in a variety of
procedures and methods known in the art which are currently applied to other proteins. The proteins of the present
invention can further be used to generate an antibody which selectively binds the protein. Such antibodies can be
either monoclonal or polyclonal antibodies, as well as fragments of these antibodies, and humanized forms.

The invention further provides antibodies which selectively bind to one of the proteins of the present invention and 
hybridomas which produce these antibodies. A hybridoma is an immortalized cell line which is capable of secreting a 
specific monoclonal antibody. 

In general, techniques for preparing polyclonal and monoclonal antibodies as well as hybridomas capable of
producing the desired antibody are well known in the art (Campbell, A. M., MONOCLONAL ANTIBODY TECHNOLOGY:
LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY, Elsevier Science Publishers,
Amsterdam, The Netherlands (1984); St. Groth et al., J. Immunol. Methods 35:1-21 (1980); Kohler and Milstein, Nature
256:495-497 (1975)), the trioma technique, the human B-cell hybridoma technique (Kozbor et al., Immunology Today
4:72 (1983); pgs. 77-96 of Cole et al., in MONOCLONAL ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc.
(1985)).

Any animal (mouse, rabbit, etc.) which is known to produce antibodies can be immunized with the pseudogene
polypeptide. Methods for immunization are well known in the art. Such methods include subcutaneous or intraperitoneal
injection of the polypeptide. One skilled in the art will recognize that the amount of the protein encoded by the ORF of
the present invention used for immunization will vary based on the animal which is immunized, the antigenicity of the
peptide and the site of injection.

The protein which is used as an immunogen may be modified or administered in an adjuvant in order to increase 
the protein's antigenicity. Methods of increasing the antigenicity of a protein are well known in the art and include, but 
are not limited to, coupling the antigen with a heterologous protein (such as globulin or galactosidase) or through the
inclusion of an adjuvant during immunization. 

For monoclonal antibodies, spleen cells from the immunized animals are removed, fused with myeloma cells, such 
as SP2/0-Ag14 myeloma cells, and allowed to become monoclonal antibody producing hybridoma cells. 

Any one of a number of methods well known in the art can be used to identify the hybridoma cell which produces
an antibody with the desired characteristics. These include screening the hybridomas with an ELISA assay, western
blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res. 175:109-124 (1988)).

Hybridomas secreting the desired antibodies are cloned and the class and subclass are determined using procedures
known in the art (Campbell, A. M., Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and
Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1984)).

Techniques described for the production of single chain antibodies (U.S. Patent 4,946,778) can be adapted to
produce single chain antibodies to proteins of the present invention.

For polyclonal antibodies, antibody-containing antisera are isolated from the immunized animal and are screened for
the presence of antibodies with the desired specificity using one of the above-described procedures.

The present invention further provides the above-described antibodies in detectably labelled form. Antibodies can
be detectably labelled through the use of radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels
(such as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as FITC or rhodamine, etc.),
paramagnetic atoms, etc. Procedures for accomplishing such labelling are well known in the art (see, for example,
Sternberger et al., J. Histochem. Cytochem. 18:315 (1970); Bayer, E. A. et al., Meth. Enzym. 62:308 (1979); Engval,
E. et al., Immunol. 109:129 (1972); Goding, J. W., J. Immunol. Meth. 13:215 (1976)).

The labeled antibodies of the present invention can be used for in vitro, in vivo, and in situ assays to identify cells
or tissues in which a fragment of the Staphylococcus aureus genome is expressed.

The present invention further provides the above-described antibodies immobilized on a solid support. Examples
of such solid supports include plastics such as polycarbonate, complex carbohydrates such as agarose and sepharose,
acrylic resins such as polyacrylamide, and latex beads. Techniques for coupling antibodies to such solid supports
are well known in the art (Weir, D. M. et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific
Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth. Enzym. 34, Academic Press, N. Y. (1974)).
The immobilized antibodies of the present invention can be used for in vitro, in vivo, and in situ assays as well as for
immunoaffinity purification of the proteins of the present invention.

3. Diagnostic Assays and Kits

The present invention further provides methods to identify the expression of one of the ORFs of the present
invention, or homolog thereof, in a test sample, using one of the DFs, antigens or antibodies of the present invention.

In detail, such methods comprise incubating a test sample with one or more of the antibodies, or one or more of
the DFs, or one or more antigens of the present invention and assaying for binding of the DFs, antigens or antibodies
to components within the test sample.

Conditions for incubating a DF, antigen or antibody with a test sample vary. Incubation conditions depend on the
format employed in the assay, the detection methods employed, and the type and nature of the DF or antibody used
in the assay. One skilled in the art will recognize that any one of the commonly available hybridization, amplification
or immunological assay formats can readily be adapted to employ the DFs, antigens or antibodies of the present
invention. Examples of such assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock, G. R. et al., Techniques in
Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science
Publishers, Amsterdam, The Netherlands (1985); and PCT publication WO95/32291, all of which are hereby
incorporated herein by reference.

The test samples of the present invention include cells, protein or membrane extracts of cells, or biological fluids
such as sputum, blood, serum, plasma, or urine. The test sample used in the above-described method will vary based
on the assay format, nature of the detection method and the tissues, cells or extracts used as the sample to be assayed.
Methods for preparing protein extracts or membrane extracts of cells are well known in the art and can readily be
adapted in order to obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry
out the assays of the present invention.

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers
which comprises: (a) a first container comprising one of the DFs, antigens or antibodies of the present invention; and
(b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting
the presence of a bound DF, antigen or antibody.

In detail, a compartmentalized kit includes any kit in which reagents are contained in separate containers. Such
containers include small glass containers, plastic containers or strips of plastic or paper. Such containers allow one
to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are
not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test sample, a container which
contains the antibodies used in the assay, containers which contain wash reagents (such as phosphate buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect the bound antibody, antigen or DF.

Types of detection reagents include labelled nucleic acid probes, labelled secondary antibodies, or in the
alternative, if the primary antibody is labelled, the enzymatic or antibody binding reagents which are capable of reacting with
the labelled antibody. One skilled in the art will readily recognize that the disclosed DFs, antigens and antibodies of the
present invention can be readily incorporated into one of the established kit formats which are well known in the art.

4. Screening Assay for Binding Agents 

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining
and identifying agents which bind to a protein encoded by one of the ORFs of the present invention or to one of the
Staphylococcus aureus fragments and contigs herein described.

In general, such methods comprise steps of:

(a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention, or an isolated
fragment of the Staphylococcus aureus genome; and

(b) determining whether the agent binds to said protein or said fragment. 

The agents screened in the above assay can be, but are not limited to, peptides, carbohydrates, vitamin derivatives,
or other pharmaceutical agents. The agents can be selected and screened at random or rationally selected or designed
using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and the like are selected 
at random and are assayed for their ability to bind to the protein encoded by the ORF of the present invention. 

Alternatively, agents may be rationally selected or designed. As used herein, an agent is said to be "rationally
selected or designed" when the agent is chosen based on the configuration of the particular protein. For example, one
skilled in the art can readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like capable of binding to a specific peptide sequence in order to generate rationally designed antipeptide peptides
(for example see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," in Synthetic Peptides, A User's
Guide, W. H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical
agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly described, can be used to
control gene expression through binding to one of the ORFs or EMFs of the present invention. As described above,
such agents can be randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a skilled
artisan to design sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control.

One class of DNA binding agents are agents which contain base residues which hybridize or form a triple helix by
binding to DNA or RNA. Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or can
be a variety of sulfhydryl or polymeric derivatives which have base attachment capacity.

Agents suitable for use in these methods usually contain 20 to 40 bases and are designed to be complementary
to a region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney
et al., Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano,
J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription from DNA, while antisense
RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated
to be effective in model systems. Information contained in the sequences of the present invention can be used to design
antisense and triple helix-forming oligonucleotides, and other DNA binding agents.
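
As a rough illustration of the antisense case only, the following minimal sketch derives a DNA oligonucleotide
complementary to a chosen window of an mRNA sequence. The function name, the example sequence and the choice to
return a DNA (rather than RNA) oligonucleotide are assumptions made for illustration; the 20 to 40 base limit is
taken from the text above. It does not address triple helix design, target-site selection or backbone chemistry.

```python
# Hypothetical helper for illustration; not part of the disclosed methods.
_RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligo (written 5'->3') complementary to mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases")
    # Accept either RNA or DNA spelling of the target by normalising T to U.
    target = mrna.upper().replace("T", "U")[start:start + length]
    if len(target) != length:
        raise ValueError("target window runs past the end of the sequence")
    if set(target) - set("ACGU"):
        raise ValueError("unexpected character in sequence")
    # Complement every base, then reverse so the result reads 5'->3'.
    return target.translate(_RNA_TO_DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    example_mrna = "AUGGCUAGCUUAGGCCAUCGAUCGGAUCCA"  # made-up sequence
    print(antisense_oligo(example_mrna, 0, 20))
```

In practice the window would be chosen from one of the sequences of the invention; the made-up sequence above only
exercises the function.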

5. Pharmaceutical Compositions and Vaccines 

The present invention further provides pharmaceutical agents which can be used to modulate the growth or
pathogenicity of Staphylococcus aureus, or another related organism, in vivo or in vitro. As used herein, a "pharmaceutical
agent" is defined as a composition of matter which can be formulated using known techniques to provide a
pharmaceutical composition. As used herein, the "pharmaceutical agents of the present invention" refers to the pharmaceutical
agents which are derived from the proteins encoded by the ORFs of the present invention or are agents which are
identified using the herein described assays.

As used herein, a pharmaceutical agent is said to "modulate the growth or pathogenicity of Staphylococcus aureus
or a related organism, in vivo or in vitro," when the agent reduces the rate of growth, rate of division, or viability of the
organism in question. The pharmaceutical agents of the present invention can modulate the growth or pathogenicity
of an organism in many fashions, although an understanding of the underlying mechanism of action is not needed to
practice the use of the pharmaceutical agents of the present invention. Some agents will modulate the growth or
pathogenicity by binding to an important protein, thus blocking the biological activity of the protein, while other agents may
bind to a component of the outer surface of the organism, blocking attachment or rendering the organism more
susceptible to the action of the body's natural immune system. Alternatively, the agent may comprise a protein encoded
by one of the ORFs of the present invention and serve as a vaccine. The development and use of vaccines derived from
membrane associated polypeptides are well known in the art. The inventors have identified particularly preferred
immunogenic Staphylococcus aureus polypeptides for use as vaccines. Such immunogenic polypeptides are described
above and summarized in Table 4, below.

As used herein, a "related organism" is a broad term which refers to any organism whose growth or pathogenicity
can be modulated by one of the pharmaceutical agents of the present invention. In general, such an organism will
contain a homolog of the protein which is the target of the pharmaceutical agent or the protein used as a vaccine. As
such, related organisms do not need to be bacterial but may be fungal or viral pathogens.

The pharmaceutical agents and compositions of the present invention may be administered in a convenient
manner, such as by the oral, topical, intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal
routes. The pharmaceutical compositions are administered in an amount which is effective for treatment and/or
prophylaxis of the specific indication. In general, they are administered in an amount of at least about 1 mg/kg body weight
and in most cases they will be administered in an amount not in excess of about 1 g/kg body weight per day. In most
cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body weight daily, taking into account the routes of
administration, symptoms, etc.

The agents of the present invention can be used in native form or can be modified to form a chemical derivative.
As used herein, a molecule is said to be a "chemical derivative" of another molecule when it contains additional chemical
moieties not normally a part of the molecule. Such moieties may improve the molecule's solubility, absorption, biological
half life, etc. The moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any undesirable
side effect of the molecule, etc. Moieties capable of mediating such effects are disclosed in, among other sources,
REMINGTON'S PHARMACEUTICAL SCIENCES (1980) cited elsewhere herein.

For example, such moieties may change an immunological character of the functional derivative, such as affinity
for a given antibody. Such changes in immunomodulation activity are measured by the appropriate assay, such as a
competitive type immunoassay. Modifications of such protein properties as redox or thermal stability, biological
half-life, hydrophobicity, susceptibility to proteolytic degradation or the tendency to aggregate with carriers or into multimers
also may be effected in this way and can be assayed by methods well known to the skilled artisan.

The therapeutic effects of the agents of the present invention may be obtained by providing the agent to a patient
by any suitable means (e.g., inhalation, intravenously, intramuscularly, subcutaneously, enterally, or parenterally). It is
preferred to administer the agent of the present invention so as to achieve an effective concentration within the blood
or tissue in which the growth of the organism is to be controlled. To achieve an effective blood concentration, the
preferred method is to administer the agent by injection. The administration may be by continuous infusion, or by single
or multiple injections.

In providing a patient with one of the agents of the present invention, the dosage of the administered agent will
vary depending upon such factors as the patient's age, weight, height, sex, general medical condition, previous medical
history, etc. In general, it is desirable to provide the recipient with a dosage of agent which is in the range of from about
1 pg/kg to 10 mg/kg (body weight of patient), although a lower or higher dosage may be administered. The
therapeutically effective dose can be lowered by using combinations of the agents of the present invention or another agent.

As used herein, two or more compounds or agents are said to be administered "in combination" with each other
when either (1) the physiological effects of each compound, or (2) the serum concentrations of each compound can
be measured at the same time. The composition of the present invention can be administered concurrently with, prior
to, or following the administration of the other agent.

The agents of the present invention are intended to be provided to recipient subjects in an amount sufficient to 
decrease the rate of growth (as defined above) of the target organism. 

The administration of the agent(s) of the invention may be for either a "prophylactic" or "therapeutic" purpose.
When provided prophylactically, the agent(s) are provided in advance of any symptoms indicative of the organism's
growth. The prophylactic administration of the agent(s) serves to prevent, attenuate, or decrease the rate of onset of
any subsequent infection. When provided therapeutically, the agent(s) are provided at (or shortly after) the onset of an
indication of infection. The therapeutic administration of the compound(s) serves to attenuate the pathological
symptoms of the infection and to increase the rate of recovery.

The agents of the present invention are administered to a subject, such as a mammal, or a patient, in a
pharmaceutically acceptable form and in a therapeutically effective concentration. A composition is said to be
"pharmacologically acceptable" if its administration can be tolerated by a recipient patient. Such an agent is said to be administered
in a "therapeutically effective amount" if the amount administered is physiologically significant. An agent is
physiologically significant if its presence results in a detectable change in the physiology of a recipient patient.

The agents of the present invention can be formulated according to known methods to prepare pharmaceutically
useful compositions, whereby these materials, or their functional derivatives, are combined in admixture with a
pharmaceutically acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of other human proteins,
e.g., human serum albumin, are described, for example, in REMINGTON'S PHARMACEUTICAL SCIENCES, 16th Ed.,
Osol, A., Ed., Mack Publishing, Easton PA (1980). In order to form a pharmaceutically acceptable composition suitable
for effective administration, such compositions will contain an effective amount of one or more of the agents of the
present invention, together with a suitable amount of carrier vehicle.

Additional pharmaceutical methods may be employed to control the duration of action. Controlled release preparations
may be achieved through the use of polymers to complex or absorb one or more of the agents of the present invention.
The controlled delivery may be effectuated by a variety of well known techniques, including formulation with
macromolecules such as, for example, polyesters, polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate,
methylcellulose, carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the macromolecules and the agent
in the formulation, and by appropriate use of methods of incorporation, which can be manipulated to effectuate a desired
time course of release. Another possible method to control the duration of action by controlled release preparations is
to incorporate agents of the present invention into particles of a polymeric material such as polyesters, polyamino
acids, hydrogels, poly(lactic acid) or ethylene vinylacetate copolymers. Alternatively, instead of incorporating these
agents into polymeric particles, it is possible to entrap these materials in microcapsules prepared, for example, by
coacervation techniques or by interfacial polymerization with, for example, hydroxymethylcellulose or
gelatin-microcapsules and poly(methylmethacrylate) microcapsules, respectively, or in colloidal drug delivery systems, for example,
liposomes, albumin microspheres, microemulsions, nanoparticles, and nanocapsules or in macroemulsions. Such
techniques are disclosed in REMINGTON'S PHARMACEUTICAL SCIENCES (1980).

The invention further provides a pharmaceutical pack or kit comprising one or more containers filled with one or
more of the ingredients of the pharmaceutical compositions of the invention. Associated with such container(s) can be
a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals
or biological products, which notice reflects approval by the agency of manufacture, use or sale for human
administration.

In addition, the agents of the present invention may be employed in conjunction with other therapeutic compounds.

6. Shot-Gun Approach to Megabase DNA Sequencing

The present invention further demonstrates that a large sequence can be sequenced using a random shotgun 
approach. This procedure, described in detail in the examples that follow, has eliminated the up front cost of isolating 
and ordering overlapping or contiguous subclones prior to the start of the sequencing protocols. 

Certain aspects of the present invention are described in greater detail in the examples that follow. The examples
are provided by way of illustration. Other aspects and embodiments of the present invention are contemplated by the
inventors, as will be clear to those of skill in the art from reading the present disclosure.

ILLUSTRATIVE EXAMPLES

LIBRARIES AND SEQUENCING

1. Shotgun Sequencing Probability Analysis

The overall strategy for a shotgun approach to whole genome sequencing follows from the Lander and Waterman
(Lander and Waterman, Genomics 2: 231 (1988)) application of the equation for the Poisson distribution. According
to this treatment, the probability, $P_0$, that any given base in a sequence of size L, in nucleotides, is not sequenced after
a certain amount, n, in nucleotides, of random sequence has been determined can be calculated by the equation
$P_0 = e^{-m}$, where m is n/L, the fold coverage. For instance, for a genome of 2.8 Mb, m = 1 when 2.8 Mb of sequence has
been randomly generated (1X coverage). At that point, $P_0 = e^{-1} = 0.37$. The probability that any given base has not
been sequenced is the same as the probability that any region of the whole sequence L has not been determined and,
therefore, is equivalent to the fraction of the whole sequence that has yet to be determined. Thus, at one-fold coverage,
approximately 37% of a polynucleotide of size L, in nucleotides, has not been sequenced. When 14 Mb of sequence
has been generated, coverage is 5X for a 2.8 Mb sequence and the unsequenced fraction drops to 0.0067 or 0.67%. 5X
coverage of a 2.8 Mb sequence can be attained by sequencing approximately 17,000 random clones from both insert
ends with an average sequence read length of 410 bp.
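
The figures in the preceding paragraph can be checked with a short calculation. The sketch below is illustrative
only: the function names are hypothetical, and it takes the fold coverage to be the amount of random sequence
divided by the sequence length, which is the reading consistent with the worked example (m = 1 when 2.8 Mb of
sequence has been generated for a 2.8 Mb genome).

```python
import math


def unsequenced_fraction(genome_bp: float, sequenced_bp: float) -> float:
    """P0 = exp(-m), with fold coverage m = sequenced_bp / genome_bp."""
    fold_coverage = sequenced_bp / genome_bp
    return math.exp(-fold_coverage)


def clones_for_coverage(genome_bp: float, fold: float, read_bp: float,
                        reads_per_clone: int = 2) -> int:
    """Clones needed when each clone is read once from each insert end."""
    return math.ceil(fold * genome_bp / (read_bp * reads_per_clone))


if __name__ == "__main__":
    genome = 2.8e6
    print(round(unsequenced_fraction(genome, 2.8e6), 2))   # 1X coverage
    print(round(unsequenced_fraction(genome, 14e6), 4))    # 5X coverage
    print(clones_for_coverage(genome, 5, 410))             # about 17,000
```

Under these assumptions the clone count comes out slightly above 17,000, in line with the approximate figure
quoted in the text.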

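Similarly, the total gap length, G, is determined by the equation $G = Le^{-m}$, and the average gap size, g, follows the
equation $g = L/n$, where n in this expression is the number of random sequence reads (here about 34,000, two reads
from each of approximately 17,000 clones) rather than the number of nucleotides sequenced. Thus, 5X coverage leaves
about 240 gaps averaging about 82 bp in size in a sequence of a polynucleotide 2.8 Mb long.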
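
The gap estimates can be reproduced in the same way. This is a minimal sketch with hypothetical names; it assumes
that the average gap size is the sequence length divided by the number of reads (about 34,000 reads, i.e. both ends
of roughly 17,000 clones), which is the interpretation that yields the 82 bp figure, and it obtains the number of
gaps by dividing the total gap length by the average gap size.

```python
import math


def gap_statistics(genome_bp: float, n_reads: int, read_bp: float):
    """Return (fold coverage, total gap length, average gap size, number of gaps)."""
    fold_coverage = n_reads * read_bp / genome_bp
    total_gap_bp = genome_bp * math.exp(-fold_coverage)  # G = L * exp(-m)
    average_gap_bp = genome_bp / n_reads                 # g = L / (number of reads)
    number_of_gaps = total_gap_bp / average_gap_bp
    return fold_coverage, total_gap_bp, average_gap_bp, number_of_gaps


if __name__ == "__main__":
    m, total, average, count = gap_statistics(2.8e6, 34_000, 410)
    print(f"coverage {m:.2f}X, total gap {total:.0f} bp, "
          f"average gap {average:.0f} bp, gaps {count:.0f}")
```

With these inputs the gap count lands in the low 230s, close to the approximately 240 gaps quoted; the small
difference presumably reflects rounding in the original figures.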

The treatment above is essentially that of Lander and Waterman, Genomics 2: 231 (1988).

2. Random Library Construction 

In order to approximate the random model described above during actual sequencing, a nearly ideal library of
cloned genomic fragments is required. The following library construction procedure was developed to achieve this end.

Staphylococcus aureus DNA was prepared by phenol extraction. A mixture containing 600 ug DNA in 3.3 ml of
300 mM sodium acetate, 10 mM Tris-HCl, 1 mM Na-EDTA, 30% glycerol was sonicated for 1 min. at 0°C in a Branson
Model 450 Sonicator at the lowest energy setting using a 3 mm probe. The sonicated DNA was ethanol precipitated
and redissolved in 500 ul TE buffer.

To create blunt-ends, a 100 ul aliquot of the resuspended DNA was digested with 5 units of BAL31 nuclease (New
England BioLabs) for 10 min at 30°C in 200 ul BAL31 buffer. The digested DNA was phenol-extracted,
ethanol-precipitated, redissolved in 100 ul TE buffer, and then size-fractionated by electrophoresis through a 1.0% low melting
temperature agarose gel. The section containing DNA fragments 1.6-2.0 kb in size was excised from the gel, and the
LGT agarose was melted and the resulting solution was extracted with phenol to separate the agarose from the DNA.
DNA was ethanol precipitated and redissolved in 20 ul of TE buffer for ligation to vector.

A two-step ligation procedure was used to produce a plasmid library with 97% inserts, of which >99% were single
inserts. The first ligation mixture (50 ul) contained 2 ug of DNA fragments, 2 ug pUC18 DNA (Pharmacia) cut with SmaI
and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase (GIBCO/BRL) and was incubated
at 14°C for 4 hr. The ligation mixture then was phenol extracted and ethanol precipitated, and the precipitated DNA
was dissolved in 20 ul TE buffer and electrophoresed on a 1.0% low melting agarose gel. Discrete bands in a ladder
were visualized by ethidium bromide staining and UV illumination and identified by size as insert (i), vector (v), v+i,
v+2i, v+3i, etc. The portion of the gel containing v+i DNA was excised and the v+i DNA was recovered and resuspended
in 20 ul TE. The v+i DNA then was blunt-ended by T4 polymerase treatment for 5 min. at 37°C in a reaction mixture
(50 ul) containing the v+i linears, 500 uM each of the 4 dNTPs, and 9 units of T4 polymerase (New England BioLabs),
under recommended buffer conditions. After phenol extraction and ethanol precipitation the repaired v+i linears were
dissolved in 20 ul TE. The final ligation to produce circles was carried out in a 50 ul reaction containing 5 ul of v+i
linears and 5 units of T4 ligase at 14°C overnight. After 10 min. at 70°C the following day, the reaction mixture was
stored at -20°C.

This two-stage procedure resulted in a molecularly random collection of single-insert plasmid recombinants with 
minimal contamination from double-insert chimeras (<1%) or free vector (<3%). 

Since deviation from randomness can arise from propagation of the DNA in the host, E. coli host cells deficient in all
recombination and restriction functions (A. Greener, Strategies 3 (1):5 (1990)) were used to prevent rearrangements,
deletions, and loss of clones by restriction. Furthermore, transformed cells were plated directly on antibiotic diffusion
plates to avoid the usual broth recovery phase which allows multiplication and selection of the most rapidly growing cells.

Plating was carried out as follows. A 100 ul aliquot of Epicurian Coli SURE II Supercompetent Cells (Stratagene
200152) was thawed on ice and transferred to a chilled Falcon 2059 tube on ice. A 1.7 ul aliquot of 1.42 M
beta-mercaptoethanol was added to the aliquot of cells to a final concentration of 25 mM. Cells were incubated on ice for
10 min. A 1 ul aliquot of the final ligation was added to the cells and incubated on ice for 30 min. The cells were heat
pulsed for 30 sec. at 42°C and placed back on ice for 2 min. The outgrowth period in liquid culture was eliminated
from this protocol in order to minimize the preferential growth of any given transformed cell. Instead the transformation
mixture was plated directly on a nutrient rich SOB plate containing a 5 ml bottom layer of SOB agar (5% SOB agar:
20 g tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media). The 5 ml bottom layer is supplemented
with 0.4 ml of 50 mg/ml ampicillin per 100 ml SOB agar. The 15 ml top layer of SOB agar is supplemented with 1 ml
X-Gal (2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4 per 100 ml SOB agar. The 15 ml top layer was poured just prior to plating.
Our titer was approximately 100 colonies/10 ul aliquot of transformation.
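
The reagent concentrations in the plating protocol follow from a simple dilution calculation, sketched below. The
helper name is hypothetical, and the sketch assumes that the added volume simply adds to the volume it is mixed into
(100 ul of cells, or 100 ml of SOB agar).

```python
def final_concentration(stock_conc: float, stock_vol: float, base_vol: float) -> float:
    """C_final = C_stock * V_stock / (V_stock + V_base); volumes share one unit."""
    return stock_conc * stock_vol / (stock_vol + base_vol)


if __name__ == "__main__":
    # 1.7 ul of 1.42 M (1420 mM) beta-mercaptoethanol added to 100 ul of cells.
    bme_mM = final_concentration(1420.0, 1.7, 100.0)
    # 0.4 ml of 50 mg/ml (50,000 ug/ml) ampicillin added to 100 ml of SOB agar.
    amp_ug_per_ml = final_concentration(50_000.0, 0.4, 100.0)
    print(f"beta-mercaptoethanol: {bme_mM:.1f} mM")
    print(f"ampicillin: {amp_ug_per_ml:.0f} ug/ml")
```

Under these assumptions the beta-mercaptoethanol works out to roughly 24 mM, consistent with the nominal 25 mM
stated, and the ampicillin in the bottom layer to roughly 200 ug/ml.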

All colonies were picked for template preparation regardless of size. Thus, only clones lost due to "poison" DNA
or deleterious gene products would be deleted from the library, resulting in a slight increase in gap number over that
expected.

3. Random DNA Sequencing 

High quality double stranded DNA plasmid templates were prepared using an alkaline lysis method developed in
collaboration with 5 Prime → 3 Prime, Inc. (Boulder, CO). Plasmid preparation was performed in a 96-well format for all
stages of DNA preparation from bacterial growth through final DNA purification. Average template concentration was
determined by running 25% of the samples on an agarose gel. DNA concentrations were not adjusted.

Templates were also prepared from a Staphylococcus aureus lambda genomic library. An unamplified library was
constructed in Lambda DASH II vector (Stratagene). Staphylococcus aureus DNA (> 100 kb) was partially digested in
a reaction mixture (200 ul) containing 50 ug DNA, 1X Sau3AI buffer, and 20 units Sau3AI for 6 min. at 23°C. The digested
DNA was phenol-extracted and centrifuged over a 10-40% sucrose gradient. Fractions containing genomic DNA of
15-25 kb were recovered by precipitation. One ul of fragments was used with 1 ul of DASH II vector (Stratagene) in
the recommended ligation reaction. One ul of the ligation mixture was used per packaging reaction following the
recommended protocol with the Gigapack II XL Packaging Extract. Phage were plated directly without amplification from
the packaging mixture (after dilution with 500 ul of recommended SM buffer and chloroform treatment). Yield was about
2.5x10^9 pfu/ul.

An amplified library was prepared from the primary packaging mixture according to the manufacturer's protocol.
The amplified library is stored frozen in 7% dimethylsulfoxide. The phage titer is approximately 1x10^ pfu/ml.

Mini-liquid lysates (0.1 ul) are prepared from randomly selected plaques and template is prepared by long range 
PCR. Samples are PCR amplified using modified T3 and T7 primers, and Elongase Supermix (LTI). 

Sequencing reactions are carried out on plasmid templates using a combination of two workstations (BIOMEK
1000 and Hamilton Microlab 2200) and the Perkin-Elmer 9600 thermocycler with Applied Biosystems PRISM Ready
Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and the M13 reverse (M13RP1) primers.
Dye terminator sequencing reactions are carried out on the lambda templates on a Perkin-Elmer 9600 Thermocycler
using the Applied Biosystems Ready Reaction Dye Terminator Cycle Sequencing kits. Modified T7 and T3 primers are
used to sequence the ends of the inserts from the Lambda DASH II library. Sequencing reactions are analyzed on a
combination of ABI 373 DNA Sequencers and ABI 377 DNA sequencers. All of the dye terminator sequencing reactions
are analyzed using the 2X 9 hour module on the ABI 377. Dye primer reactions are analyzed on a combination of ABI 373
and ABI 377 DNA sequencers. The overall sequencing success rate is approximately 85% for M13-21 and M13RP1
sequences and 65% for dye-terminator reactions. The average usable read length is 485 bp for M13-21 sequences,
445 bp for M13RP1 sequences, and 375 bp for dye-terminator reactions.
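
To give a sense of what these success rates and read lengths imply for throughput, the following sketch estimates
the expected usable sequence per attempted reaction and the number of reactions needed to reach a target amount of
raw sequence. It is a back-of-the-envelope illustration with hypothetical names; the 14 Mb target is simply the 5X
figure from the probability analysis, and the text itself does not present this calculation.

```python
import math

# (success rate, average usable read length in bp), taken from the paragraph above.
READ_STATS = {
    "M13-21": (0.85, 485),
    "M13RP1": (0.85, 445),
    "dye-terminator": (0.65, 375),
}


def usable_bp_per_reaction(success_rate: float, read_bp: float) -> float:
    """Expected usable bases per attempted reaction."""
    return success_rate * read_bp


def reactions_needed(target_bp: float, success_rate: float, read_bp: float) -> int:
    """Attempted reactions required, on average, to accumulate target_bp."""
    return math.ceil(target_bp / usable_bp_per_reaction(success_rate, read_bp))


if __name__ == "__main__":
    for chemistry, (rate, length) in READ_STATS.items():
        print(chemistry,
              round(usable_bp_per_reaction(rate, length), 2),
              reactions_needed(14e6, rate, length))
```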

4. Protocol for Automated Cycle Sequencing

The sequencing was carried out using Hamilton Microstation 2200, Perkin Elmer 9600 thermocyclers, ABI 373 
and ABI 377 Automated DNA Sequencers. The Hamilton combines pre-aliquoted templates and reaction mixes 
consisting of deoxy- and dideoxynucleotides, the thermostable Taq DNA polymerase, fluorescently-labelled sequencing 
primers, and reaction buffer. Reaction mixes and templates were combined in the wells of a 96-well thermocycling 
plate and transferred to the Perkin Elmer 9600 thermocycler. Thirty consecutive cycles of linear amplification (i.e., 
single-primer synthesis) were performed, each comprising denaturation, annealing of primer and template, and extension 
(i.e., DNA synthesis). A heated lid with rubber gaskets on the thermocycling plate prevents evaporation without the need for 
an oil overlay. 
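
The difference between the linear amplification used in cycle sequencing and the exponential amplification of two-primer PCR can be illustrated with a small calculation. The following is only a sketch: the function names and the idealized per-cycle efficiency parameter are assumptions made for illustration and are not part of the protocol described above.

```python
def linear_yield(template_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Extension products from single-primer cycling.

    Only the original template is copied in each cycle, so the number of
    products grows in proportion to the cycle count.
    """
    return template_copies * efficiency * cycles


def exponential_yield(template_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Copies present after two-primer PCR.

    Newly made strands serve as templates in later cycles, so the pool is
    multiplied by (1 + efficiency) in every cycle.
    """
    return template_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    cycles = 30  # the cycle count stated in the protocol
    print("linear products per template:", linear_yield(1, cycles))
    print("exponential copies per template:", exponential_yield(1, cycles))
```

Under these idealized assumptions, thirty single-primer cycles give at most thirty extension products per template molecule, which is why the amount and quality of template matter for read quality.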

Two sequencing protocols were used: one for dye-labelled primers and a second for dye-labelled dideoxy chain 
terminators. The shotgun sequencing involves use of four dye-labelled sequencing primers, one for each of the four 
terminator nucleotides. Each dye-primer was labelled with a different fluorescent dye, permitting the four individual 
reactions to be combined into one lane of the 373 or 377 DNA Sequencer for electrophoresis, detection, and 
base-calling. ABI currently supplies premixed reaction mixes in bulk packages containing all the necessary non-template 
reagents for sequencing. Sequencing can be done with both plasmid and PCR-generated templates with both 
dye-primers and dye-terminators with approximately equal fidelity, although plasmid templates generally give longer usable 
sequences. 

Thirty-two reactions were loaded per ABI 373 Sequencer each day and 96 samples can be loaded on an ABI 377 
per day. Electrophoresis was run overnight (ABI 373) or for 2 1/2 hours (ABI 377) following the manufacturer's protocols. 
Following electrophoresis and fluorescence detection, the ABI 373 or ABI 377 performs automatic lane tracking and 
base-calling. The lane-tracking was confirmed visually. Each sequence electropherogram (or fluorescence lane trace) 
was inspected visually and assessed for quality. Trailing sequences of low quality were removed and the sequence 
itself was loaded via software to a Sybase database (archived daily to 8mm tape). Leading vector polylinker sequence 
was removed automatically by a software program. Average edited lengths of sequences from the standard ABI 373 
or ABI 377 were around 400 bp and depend mostly on the quality of the template used for the sequencing reaction. 
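
The two editing steps described above (removal of leading vector polylinker and of trailing low-quality sequence) can be sketched as follows. This is an illustrative assumption, not the software actually used: the function names, the exact-match search for the polylinker, and the use of ambiguous base calls ("N") as a stand-in for visual quality assessment are all hypothetical.

```python
def strip_leading_vector(read: str, polylinker: str, max_offset: int = 20) -> str:
    """Drop everything up to and including the polylinker if it occurs near the 5' end.

    Assumption: the polylinker is matched exactly; real vector screening
    would tolerate sequencing errors.
    """
    idx = read.upper().find(polylinker.upper())
    if idx != -1 and idx <= max_offset:
        return read[idx + len(polylinker):]
    return read


def trim_trailing_low_quality(read: str, window: int = 10, max_n: int = 1) -> str:
    """Cut the read at the first 'N' of the first window holding more than max_n 'N' calls."""
    read = read.upper()
    last_start = max(len(read) - window, 0)
    for start in range(last_start + 1):
        if read[start:start + window].count("N") > max_n:
            # The window contains at least one 'N', so index() cannot fail here.
            return read[:read.index("N", start)]
    return read


if __name__ == "__main__":
    raw = "GGATCC" + "ACGT" * 5 + "ANNTNNANNN"
    edited = trim_trailing_low_quality(strip_leading_vector(raw, "GGATCC"))
    print(edited, len(edited))
```

The window size and the permitted number of ambiguous calls are arbitrary illustration values; in the work described here the trailing cut was judged by eye from the electropherogram.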

INFORMATICS 


1. Data Management 

A number of information management systems for a large-scale sequencing lab have been developed. (For review 
see, for instance, Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System 
Sciences, IEEE Computer Society Press, Washington D.C., 585 (1993)). The system used to collect and assemble 
the sequence data was developed using the Sybase relational database management system and was designed to 
automate data flow wherever possible and to reduce user error. The database stores and correlates all information 
collected during the entire operation from template preparation to final analysis of the genome. Because the raw output 
of the ABI 373 Sequencers was based on a Macintosh platform and the data management system chosen was based 
on a Unix platform, it was necessary to design and implement a variety of multi-user, client-server applications which 
allow the raw data as well as analysis results to flow seamlessly into the database with a minimum of user effort. 

2. Assembly 

An assembly engine (TIGR Assembler) developed for the rapid and accurate assembly of thousands of sequence 
fragments was employed to generate contigs. The TIGR assembler simultaneously clusters and assembles fragments 
of the genome. In order to obtain the speed necessary to assemble more than 10^ fragments, the algorithm builds a 
hash table of 12 bp oligonucleotide subsequences to generate a list of potential sequence fragment overlaps. The 
number of potential overlaps for each fragment determines which fragments are likely to fall into repetitive elements. 
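
The idea of using a hash table of 12 bp subsequences to list potential overlaps can be sketched as below. This is not the TIGR Assembler code: the function names, the `max_partners` threshold and the decision to ignore reverse-complement matches are simplifying assumptions made only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_kmer_index(fragments: dict, k: int = 12) -> dict:
    """Map every k-mer to the set of fragment names that contain it."""
    index = defaultdict(set)
    for name, seq in fragments.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def potential_overlaps(fragments: dict, k: int = 12) -> dict:
    """Map each fragment to the other fragments sharing at least one k-mer with it."""
    overlaps = {name: set() for name in fragments}
    for names in build_kmer_index(fragments, k).values():
        for a, b in combinations(sorted(names), 2):
            overlaps[a].add(b)
            overlaps[b].add(a)
    return overlaps


def likely_repetitive(overlaps: dict, max_partners: int) -> set:
    """Flag fragments with more candidate partners than unique sequence would explain."""
    return {name for name, partners in overlaps.items() if len(partners) > max_partners}


if __name__ == "__main__":
    base = "ATGCGTACGTTAGCCGATAGGCTTAACCGGTTAAGCTAGCTAGGATCCA"
    frags = {"f1": base[:30], "f2": base[15:45], "f3": "T" * 20 + "G" * 10}
    print(potential_overlaps(frags))
```

Only fragment pairs that share a k-mer need to be passed on to the slower alignment step, which is where the speed gain comes from.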

Beginning with a single seed sequence fragment, TIGR Assembler extends the current contig by attempting to add 
the best matching fragment based on oligonucleotide content. The contig and candidate fragment are aligned using a 
modified version of the Smith-Waterman algorithm which provides for optimal gapped alignments (Waterman, M. S., 
Methods in Enzymology 164:765 (1988)). The contig is extended by the fragment only if strict criteria for the quality 
of the match are met. The match criteria include the minimum length of overlap, the maximum length of an unmatched 
end, and the minimum percentage match. These criteria are automatically lowered by the algorithm in regions of minimal 
coverage and raised in regions with a possible repetitive element. The number of potential overlaps for each fragment 
determines which fragments are likely to fall into repetitive elements. Fragments representing the boundaries of 
repetitive elements and potentially chimeric fragments are often rejected based on partial mismatches at the ends of 
alignments and excluded from the current contig. TIGR Assembler is designed to take advantage of clone size information 
coupled with sequencing from both ends of each template. It enforces the constraint that sequence fragments from 
two ends of the same template point toward one another in the contig and are located within a certain range of base 
pairs (definable for each clone based on the known clone size range for a given library). 
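
The clone-end constraint described above can be expressed as a simple check on the placement of two reads in a contig. The sketch below is an assumption-laden illustration: the `PlacedRead` record, its fields and the `mates_consistent` function are hypothetical names and do not reflect the assembler's internal data structures.

```python
from dataclasses import dataclass


@dataclass
class PlacedRead:
    start: int     # leftmost contig coordinate covered by the read
    end: int       # one past the rightmost contig coordinate covered by the read
    forward: bool  # True if the read runs left to right along the contig


def mates_consistent(a: PlacedRead, b: PlacedRead, min_insert: int, max_insert: int) -> bool:
    """Check that two reads from opposite ends of one template fit the clone size range.

    The left read must run forward and the right read backward, so that they
    point toward one another, and the span between their outer ends must lie
    within the known insert size range of the library.
    """
    left, right = (a, b) if a.start <= b.start else (b, a)
    if not (left.forward and not right.forward):
        return False
    span = right.end - left.start
    return min_insert <= span <= max_insert


if __name__ == "__main__":
    r1 = PlacedRead(100, 500, True)
    r2 = PlacedRead(1700, 2100, False)
    print(mates_consistent(r1, r2, 1500, 2500))
```

A pair that fails this check suggests a misassembly or a chimeric clone, which is how end-pair information helps resolve repeats.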

3. Identifying Genes 


The predicted coding regions of the Staphylococcus aureus genome were initially defined with the program zorf, 
which finds ORFs of a minimum length. The predicted coding region sequences were used in searches against a 
database of all Staphylococcus aureus nucleotide sequences from GenBank (release 92.0), using the BLASTN search 
method to identify overlaps of 50 or more nucleotides with at least 95% identity. Those ORFs with nucleotide sequence 
matches are shown in Table 1. The ORFs without such matches were translated to protein sequences and 
compared to a non-redundant database of known proteins generated by combining the Swiss-Prot, PIR and GenPept 
databases. ORFs of at least 80 amino acids that matched a database protein with BLASTP probability less than or 
equal to 0.01 are shown in Table 2. The table also lists assigned functions based on the closest match in the databases. 
ORFs of at least 120 amino acids that did not match protein or nucleotide sequences in the databases at these levels 
are shown in Table 3. 
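
Finding ORFs of a minimum length can be sketched as a scan of all six reading frames. The following is not the zorf program, whose details are not given here; the choice of start codons (ATG, GTG, TTG), the function names and the coordinate convention are assumptions made for illustration.

```python
STOPS = {"TAA", "TAG", "TGA"}
STARTS = {"ATG", "GTG", "TTG"}  # assumed bacterial start codons
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq: str, min_codons: int) -> list:
    """Return (start, end) pairs, end exclusive and including the stop codon."""
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in STARTS:
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    found.append((start, i + 3))
                start = None
    return found


def find_orfs(seq: str, min_codons: int = 80) -> list:
    """Scan both strands; '-' coordinates refer to the reverse-complemented sequence."""
    seq = seq.upper()
    hits = [("+", s, e) for s, e in orfs_in_strand(seq, min_codons)]
    hits += [("-", s, e) for s, e in orfs_in_strand(reverse_complement(seq), min_codons)]
    return hits


if __name__ == "__main__":
    demo = "CC" + "ATG" + "GCA" * 5 + "TAA" + "GG"
    print(find_orfs(demo, min_codons=5))
```

The default of 80 codons mirrors the 80 amino acid cut-off used for Table 2; the 120 amino acid cut-off for Table 3 would correspond to `min_codons=120`.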

ILLUSTRATIVE APPLICATIONS 


1. Production of an Antibody to a Staphylococcus aureus Protein 

Substantially pure protein or polypeptide is isolated from the transfected or transformed cells using any one of the 
methods known in the art. The protein can also be produced in a recombinant prokaryotic expression system, such as 
E. coli, or can be chemically synthesized. Concentration of protein in the final preparation is adjusted, for example, by 
concentration on an Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the 
protein can then be prepared as follows. 

2. Monoclonal Antibody Production by Hybridoma Fusion 

Monoclonal antibody to epitopes of any of the peptides identified and isolated as described can be prepared from 
murine hybridomas according to the classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or 
modifications of the methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein 
over a period of a few weeks. The mouse is then sacrificed, and the antibody producing cells of the spleen isolated. 
The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells 
destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused 
cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued. 
Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay 
procedures, such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70:419 (1980), and modified 
methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. 
Detailed procedures for monoclonal antibody production are described in Davis, L. et al., Basic Methods in Molecular 
Biology, Elsevier, New York, Section 21-2 (1989). 

3. Polyclonal Antibody Production by Immunization 


Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein can be prepared by 
immunizing suitable animals with the expressed protein described above, which can be unmodified or modified to enhance 
immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and 
the host species. For example, small molecules tend to be less immunogenic than others and may require the use of 
carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with both inadequate and 
excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple 
intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis, 
J. et al., J. Clin. Endocrinol. Metab. 33:988-991 (1971). 

Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as 
determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the 
antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology, 
Wier, D., ed, Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum 
(about 12 µM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described, 
for example, by Fisher, D., Chap. 42 in: Manual of Clinical Immunology, second edition, Rose and Friedman, eds., Amer. 
Soc. For Microbiology, Washington, D.C. (1980). 

Antibody preparations prepared according to either protocol are useful in quantitative immunoassays which 
determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively 
or qualitatively to identify the presence of antigen in a biological sample. In addition, they are useful in various animal 
models of Staphylococcal disease known to those of skill in the art as a means of evaluating the protein used to make 
the antibody as a potential vaccine target or as a means of evaluating the antibody as a potential immunotherapeutic 
reagent. 

3. Preparation of PCR Primers and Amplification of DNA 

Various fragments of the Staphylococcus aureus genome, such as those of Tables 1-3 and SEQ ID NOS: 1-5,191 
can be used, in accordance with the present invention, to prepare PCR primers for a variety of uses. The PCR primers 
are preferably at least 15 bases, and more preferably at least 18 bases in length. When selecting a primer sequence, 
it is preferred that the primer pairs have approximately the same G/C ratio, so that melting temperatures are 
approximately the same. The PCR primers and amplified DNA of this Example find use in the Examples that follow. 
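
The primer selection criteria above (minimum length and matched G/C content, hence matched melting temperature) can be checked with a short calculation. The sketch below uses the common rule-of-thumb estimate Tm = 2(A+T) + 4(G+C) in degrees Celsius for short oligonucleotides; that rule, the function names and the 4 degree tolerance are assumptions introduced for illustration and are not specified in the text.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def rule_of_thumb_tm(primer: str) -> int:
    """Approximate melting temperature in degrees C: 2 per A/T plus 4 per G/C."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def primers_compatible(fwd: str, rev: str, min_len: int = 15, max_tm_diff: int = 4) -> bool:
    """Both primers meet the minimum length and their estimated Tm values are close."""
    long_enough = len(fwd) >= min_len and len(rev) >= min_len
    return long_enough and abs(rule_of_thumb_tm(fwd) - rule_of_thumb_tm(rev)) <= max_tm_diff


if __name__ == "__main__":
    fwd = "ATGCATGCATGCATGCAT"
    rev = "GCGCGCGCGCGCGCGCGC"
    print(rule_of_thumb_tm(fwd), rule_of_thumb_tm(rev), primers_compatible(fwd, rev))
```

The default minimum length of 15 follows the text; passing `min_len=18` applies the more preferred length.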
4. Gene expression from DNA Sequences Corresponding to ORFs 

A fragment of the Staphylococcus aureus genome provided in Tables 1-3 is introduced into an expression vector 
using conventional technology. Techniques to transfer cloned sequences into expression vectors that direct protein 
translation in mammalian, yeast, insect or bacterial expression systems are well known in the art. Vectors and 
expression systems are commercially available from a variety of suppliers including Stratagene (La Jolla, California), 
Promega (Madison, Wisconsin), and Invitrogen (San Diego, California). If desired, to enhance expression and facilitate 
proper protein folding, the codon context and codon pairing of the sequence may be optimized for the particular 
expression organism, as explained by Hatfield et al., U.S. Patent No. 5,082,767, incorporated herein by this reference. 

The following is provided as one exemplary method to generate polypeptide(s) from cloned ORFs of the 
Staphylococcus aureus genome fragment. Bacterial ORFs generally lack a poly A addition signal. The addition signal sequence 
can be added to the construct by, for example, splicing out the poly A addition sequence from pSG5 (Stratagene) using 
BglI and SalI restriction endonuclease enzymes and incorporating it into the mammalian expression vector pXT1 
(Stratagene) for use in eukaryotic expression systems. pXT1 contains the LTRs and a portion of the gag gene of Moloney 
Murine Leukemia Virus. The positions of the LTRs in the construct allow efficient stable transfection. The vector includes 
the Herpes Simplex thymidine kinase promoter and the selectable neomycin gene. The Staphylococcus aureus DNA 
is obtained by PCR from the bacterial vector using oligonucleotide primers complementary to the Staphylococcus 
aureus DNA and containing restriction endonuclease sequences for PstI incorporated into the 5' primer and BglII at 
the 5' end of the corresponding Staphylococcus aureus DNA 3' primer, taking care to ensure that the Staphylococcus 
aureus DNA is positioned such that it is followed by the poly A addition sequence. The purified fragment obtained from 
the resulting PCR reaction is digested with PstI, blunt ended with an exonuclease, digested with BglII, purified and 
ligated to pXT1, now containing a poly A addition sequence and digested with BglII. 

The ligated product is transfected into mouse NIH 3T3 cells using Lipofectin (Life Technologies, Inc., Grand Island, 
New York) under conditions outlined in the product specification. Positive transfectants are selected after growing the 
transfected cells in 600 ug/ml G418 (Sigma, St. Louis, Missouri). The protein is preferably released into the supernatant. 
However, if the protein has membrane binding domains, the protein may additionally be retained within the cell or 
expression may be restricted to the cell surface. Since it may be necessary to purify and locate the transfected product, 
synthetic 15-mer peptides synthesized from the predicted Staphylococcus aureus DNA sequence are injected into 
mice to generate antibody to the polypeptide encoded by the Staphylococcus aureus DNA. 

Alternatively, and if antibody production is not possible, the Staphylococcus aureus DNA sequence is additionally 
incorporated into eukaryotic expression vectors and expressed as, for example, a globin fusion. Antibody to the globin 
moiety then is used to purify the chimeric protein. Corresponding protease cleavage sites are engineered between the 
globin moiety and the polypeptide encoded by the Staphylococcus aureus DNA so that the latter may be freed from 
the former by simple protease digestion. One useful expression vector for generating globin chimerics is pSG5 
(Stratagene). This vector encodes a rabbit globin. Intron II of the rabbit globin gene facilitates splicing of the expressed 
transcript, and the polyadenylation signal incorporated into the construct increases the level of expression. These 
techniques are well known to those skilled in the art of molecular biology. Standard methods are published in methods 
texts such as Davis et al., cited elsewhere herein, and many of the methods are available from the technical assistance 
representatives from Stratagene, Life Technologies, Inc., or Promega. Polypeptides of the invention also may be 
produced using in vitro translation systems such as in vitro Express(TM) Translation Kit (Stratagene). 

While the present invention has been described in some detail for purposes of clarity and understanding, one 
skilled in the art will appreciate that various changes in form and detail can be made without departing from the true 
scope of the invention. 

All patents, patent applications and publications referred to above are hereby incorporated by reference. 
ORF rows continued (no antigenic-region entries legible): 771_1, 999_1, 853_1, 171_11, 353_2, 342_4, 70_6, 129_2, 58_5, 236_6 
Range printed without a legible ORF: 429-438 

Table 4 

ORF      Antigenic Regions (cont) 
         Region 23   Region 24   Region 25   Region 26   Region 27   Region 28 
310_8    622-632     670-685     708-718     823-836     858-867     877-886 

ORFs listed with no entries in Regions 23-28: 168_6, 238_1, 51_2, 278_3, 276_2, 45_4, 316_8, 154_15, 228_3, 228_6, 
50_1, 112_7, 442_1, 66_2, 304_2, 44_1, 161_4, 46_5, 942_1, 5_[illegible], 20_4, 328_2, 520_2, 771_1, 999_1, 853_1, 
287_1, 288_2, 596_2, 217_5, 217_6, 528_3, 171_11, 63_4, 353_2, 743_1, 342_4, 69_3, 70_6, 129_2, 58_5, 188_3, 
236_6, 601_1, 544_3, 652_1, 87_7, 120_1 

Table 4 

ORF      Antigenic Regions (cont) 
         Region 29   Region 30 

ORFs listed with no entries in Regions 29-30: 168_6, 238_1, 51_2, 278_3, 276_2, 45_4, 316_8, 154_15, 228_6, 50_1, 
112_7, 442_1, 66_2, 304_2, 44_1, 161_4, 46_5, 20_4, 771_1, 853_1, 287_1, 288_2, 596_2, 217_6, 528_3, 171_11, 
63_4, 353_2, 743_1, 342_4, 69_3, 70_6, 129_2, 58_5, 188_3, 236_6, 310_8, 601_1, 544_3, 652_1, 87_7, 120_1 






EP 0 786 519 A2 



Table 4 

ORF      SEQ ID NO   BLAST HOMOLOG                   Antigenic Regions 
                                                     Region 1   Region 2   Region 3   Region 4 
46_1     5241        aldehyde dehydrogenase          8-17       36-52      83-90      [illegible] 
63_4     5242        glycerol ester hydrolase (P.    9-26       57-73      93-107     [illegible] 
174_6    5243        ketopantoate hydroxymeth        71-80      203-212    242-254    265-274 
206_16   5244        ornithine acetyltransferase     1-10       34-43      54-63      194-210 
267_1    5245        NaH-antiporter protein (E.      120-129    332-347    398-408 
322_1    5246        acriflavin resistance protein   58-75      153-164    203-231    264-284 
415_2    5247        transport ATP-binding prot      108-126    218-227    298-308    315-334 
214_3    5248        2-nitropropane dioxygenas       123-136    216-233    283-292    297-306 
587_3    5249        clumping factor                 5-14       43-54      59-68      76-95 
685_1    5250        signal peptidase                59-68      72-81      86-95      99-108 
54_3     5251        fibronectin binding protein     23-32      37-46      50-59      89-98 
54_4     5252        fibronectin binding protein     43-52      66-75      95-104 
54_5     5253        fibronectin binding protein     49-60      81-90 
54_6     5254        fibronectin binding protein     55-71      82-97      139-158    175-1[illegible] 
328_1    5255        lipoprotein (H. flu)            11-20      61-70      96-105 

Antigenic Regions (cont): Region 6, Region 7, Region 8 

ORF                  Entries in printed order 
46_1 
63_4 
206_16 
267_1 
[ORF not legible]    371-380, 486-495 
214_3                318-337 
[ORF not legible]    106-115, 142-151, 185-198 
685_1                113-122, 130-145 
54_3                 217-226, 251-260, 295-305 
54_4                 175-188, 191-200, 203-212, 220-229 
54_5                 220-230, 364-373, 378-387 
328_1 

ORF      Antigenic Regions (cont): Region 11, Region 12, Region 13, Region 14, Region 15, Region 17 

ORF      Entries in printed order 
63_4     306-315, 319-328, 366-376, 395-420, 453-462, 467-476 
174_6 
206_16 
267_1 
322_1 
214_3 
587_3    217-226, 332-342, 351-360, 377-386 
54_3     316-325, 329-345 
54_5 
328_1 

ORF      Entries in printed order (region headings not legible) 
685_1 
54_3     455-462, 472-491, 517-536 
54_5     673-681, 703-715, 723-732, 749-760 

Antigenic Regions (cont): Region 25, Region 26, Region 28, Region 29 

ORF      Entries in printed order 
46_1 
63_4 
174_6 
206_16 
267_1 
322_1 
415_2 
214_3 
587_3    567-578, 584-601, 607-840, 844-854, 858-870, 877-886 
685_1 
54_4 
54_5 
54_6     793-802, 811-826, 834-848, 866-876 
328_1 

ORF      Region 30   Region 31 
46_1 
63_4 
174_6 
206_16 
267_1 
322_1 
415_2 
214_3 
587_3    889-911     927-936 
685_1 
54_3 
54_4 
54_5 
54_6     925-944     951-997 
328_1 

SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: Human Genome Sciences, Inc. 

(B) STREET: 9410 Key West Avenue 

(C) CITY: Rockville 

(D) STATE: Maryland 

(E) COUNTRY: US 

(F) POSTAL CODE: 20850 

(ii) TITLE OF INVENTION: Staphylococcus aureus 
nucleotides and Sequences 

(iii) NUMBER OF SEQUENCES: 5255 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4 

(B) COMPUTER: HP Vectra 486/33 

(C) OPERATING SYSTEM: MSDOS version 6.2 

(D) SOFTWARE: ASCII Text 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 60/009,861 

(B) FILING DATE: 05-JAN-1996 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5895 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

TCCATTATGA AGTCACAAGT ACTATAAGCT GCGATGTTAC CAATGTTTTT TAAAATCCCA 60 

GTAATAAAAT CAAAAAATAA GTTAAATAAT GTATTCATTT TAAGTCCTCC TTAATAAAGa 120 

aaataGGTAA TAATGTAATA GCTTCTATTA TGATGCCTAA TTGAATGAAT TGGGCAAATG 180 

GCTCTTTGAT GATAAGTGTG ATAATGAAAA GGGTTAAACT AACAATAATC GCATAATATT 240 

TTTTTCGTTT AATAAGTCGC ACAGGAATGG GCTTCTTTTT AGTTGCTGCA GGAGCATATA 300 

CTGAGATTAC ACCTAAAGAA ATAACTGTTA AAATAATCAT AATTAAAAAG TTAATATGAA 360 

AATTTACTAT TACTAAAGGT AAAAGTATAA ATAGTATAAT ACTTTCTACA TAACACCAAA 420 

AAGAAGAAGG TGCATGTGCa CCATGTGCAT GtCTTCTTAT TAAATAAAAT GTTAAATTCG 480 

TAATTAACGT AAACAGAAAA ATGTTTAAAA TATAGGCAAT AGTATACATA ACAATTAATT 540 

TACCTATATT TTTAGCTAAG ACCTGCATCC CTAATCGTAC TTGCAAAAAT TGAATATGAT 600 

CTAAGTTATT TCTCTTTTGA AGATACGTGG CAAACTGGTC AATTTTATTA TCAAAATAAT 660 

TCAATTTTAC ACCACTCTCC TCACTGTCAT TATACGATTT AGTACAATCT TTTATCATTA 720 

TATTGCCTAA CTGTAGGAAA TAAATACTTA ACTGTTAAAT GTAATTTGTA TTTAATATTT 780 

TAACATAAAA AAATTTACAG TTAAGAATAA AAAACGACTA GTTAAGAAAA ATTGGAAAAT 840 

AAATGCTTTT AGCATGTTTT AATATAACTA GATCACAGAG ATGTGATGGA AAATAGTTGA 900 

TGAGTTGTTT AATTTTAAGA ATTTTTATCT TAATTAAGGA AGGAGTGATT TCAATGGCAC 960 

AAGATATCAT TTCAACAATC GGTGACTTAG TAAAATGGAT TATCGACACA GTGAACAAAT 1020 

TCACTAAAAA ATAAGATGAA TAATTAATTA CTTTCATTGT AAATTTGTTA TCTTCGTATA 1080 

GTACTAAAAG TATGAGTTAT TAAGCCATCC CAACTTAATA ACCATGTAAA ATTAGCAAGT 1140 

GAGTAACATT TGCTAGTAGA GTTAGTTTCC TTGGACTCAG TGCTATGTAT TTTTCTTAAT 1200 

TATCATTACA GATAATTATT TCTAGCATGT AAGCTATCGT AAACAACATC GATTTATCAT 1260 

TATTTGATAA ATAAAATTTT TTTCATAATT AATAACATCC CCAAAAATAG ATTGAAAAAA 1320 

TAACTGTAAA ACATTCCCTT AATAATAAGT ATGGTCGTGA GCCCCTCCCA AGCTCGCGGC 1380 

CTTTTTTGTA ATGAAGAAGG GATGAGTTAA TCATCATTAT GAGACCCGCC GTTAAAATAT 1440 

TCATTTGCAA AOGGCGAAAT GGGTTCTTAC TGAGTTATCT ATTATAAAAA AATAAACATA 1560 

GACTTATGAA AAATCTCTCA TAAATCTATG TTTAGTCATG aCATGTGTTA AATATTATTT 1620 

CGGGCGCTTC TTATTTATAC AAATCTAATT TAATACTTTT AAATACAGGT ATATTTTCgC 1680 

GTTGCTGTTC TACTTCATTT AAGTTTAAAT CTACAGTCAA AATATCTGCG GATTCATTTA 1740 

ATTCTCCAAC TAAATCTCCA TTTGGGTTTA TAACTATCGA ATGACCAGCA TATTCTGTGT 1800 

TACCATCGAA TCCAGTGCTA TTAGTTCCAA TGACAAACAT ATTATTTTCA ATTGCACGTG 1860 

CCTTTAGTAA TGAATGCCAA TGTTGAAGAC GTGACATAGG CCATTGCGCC ACATAAAATG 1920 

CAATTTTAGC ACCACTACGA GCAGGATATC TTAATAATTC TGGAAAACGT AAATCATAAC 1980 

AGATAAGTTG GGTCACATAA GTACCGTCAG ACAATTGAAA GGGTTCAGCT ACGTATTCGC 2040 

CAGCGGTTAA AAATTCATGC TCTCTTAACA TAGGAACTAA ATGAACTTTG TCGTATTCaT 2100 

TAATCAGCTG GCCACTTTTA TTCACACTAA AAGCTGTATT AAATATTTGA TTGTTTCTAA 2160 

TGTTAGAAAC TGACCCAGCT ACGATATCGA CTTTATATTT TTCAGCTAAA TGTTTAATAA 2220 

ATGAAAAACT TTGTCCTAGA TTATTATCTG CTTTTTCATT TAAATGCTCT AAATCATAGC 2280 

CATTATTCCA CATTTCAGGT AAAACGACTA CATCTACTTC AGCATTCATA TTTTTTTCGA 2340 

ACCATTGCGT TATTTGAGTT TCATTTTTAG AACTATCTCC AAAAACAATC GGTAATTGAT 2400 

AAATTTGGAC TTTCATAACA TCACATCCTT GATAGATCTT ATATATAACT TACTAAAAGT 2460 

TATGTTGAAA CGCAAAAAAC GAGCACAAGA CATAAAATCA AAGTCCTAGG CTCTACAAAG 2520 

TTATATTGAC AGTAGTTGAT GGGGCCCCAA CATAGAGAAA TTGGAACACC AATTTCTACA 2580 

GACAATGCAA GTTGGGGTGG GCTCTAACAT AAA6AAATAC TTTTTCTTTA GAAATTAGTA 2640 

TTTCTTATAC ATGAGTTTTA CTCATGTATT CCTATTCTTA AGTGCACATT AGCAGCGGCT 2700 

AATGTGTAAG AACTACTACA TAATGAATAA CTAATGATTC TTTATCATTT CTGTCCCATT 2760 

CCTAACAATA TATTGATTAT TTTTTTATTA CGAAACGATC TTCCACTGGA TTAAATGTTT 2820 

TTTCGCCAGC AGCTTCACGA ATATCACCAA ATGGCATTTG AGCAATAAGT TTCCAACTTT 2880 

TAGGAATATT AAATTCATTT GAAGTCATCT CATCAACAAG TGGATTATA6 TGTTGTAATG 2940 

AAGCACCTAT GCCTTTAGTA GCTAATGCAG TCCAAATTGC AAATTGATGC ATGGCATTTC 3000 

TTTGAGTTGA CCATATTGCA AAATTATCAT AGTAGTTTGG CATTTGTTCT TGTAAACCAC 3060 

TTACAACATC TTGATCTTCA TAAAACAAAA TTGTACCGTA TGAATGTTTG AAGTTATCAA 3120 

TTTTTTGTTC AGTTGGCTCG AAATCACGAT TCTCTCCCAT GACTTCTTTT AAAATTGCTT 3180 

TTGTGTTATC CCAAAATTTA TTATTGTTGT CATTTAACAA GAGAACAATT CTAGTTGATT 3240 

CATCGCTAAT TGATATCGAA TCTTTCAAAT TATATATTGA ACGTCTTTCT TCCATTGCAT 3360 

TGTCAAAAGT CATTGCTTTT TTATCTTTTT TAAATAAGCC CATAATTATT GCTCCTTCTT 3420 

TAGTAAAGAA TACTTAATAG ACTAAGTATA AAATTTATAC TCGTACTTGT AAAGCAATAT 3480 

TTACGAAAAT TTCAAGAATA TTAATATTCA TTTTCAAATT CCAAATATAA ATGCATTTTC 3540 

AACGCATATT TATTATACTT AGATTAATAC TTACATGAAA AAGGGAGGTG TCTCGTGAAA 3600 

TGTCATATCA TTGGTTTAAG AAAATGTTAC TTTCAACAAG TATTTTAATT TTAAGTAGTA 3660 

GTAGTTTAGG GCTTGCAACG CACACAGTTG AAGCAAAGGA TAACTTAAAT GGAGAAAAAC 3720 

CAACTACTAA TTTGAATCAT AATATAACTT CACCATCAGT AAATAGTGAA ATGAATAATA 3780 

ATGAGACTGG GACACCTCAC GAATCAAATC AAACGGGTAA TGAAGGAACA GGTTCX3AATA 3840 

GTCGTGATGC TAATCCTGAT TCX5AATAATG TGAAGCCAGA CTCAAACAAC CAAAACCCAA 3900 

GTACAGATTC AAAACCAGAC CCAAATAACC AAAACTCAAG TCCGAATCCT AAACCAGATC 3960 

CAGATAACCC GAAACCAAAA CCX3GATCCAA AACCAGACCC AGATAAACCA AAGCCAAATC 4020 

CGGATCCAAA ACCAGATCCA GATAACCCGA AACCAAATCC AGATCCAAAA CCAGACCCAG 4080 

ATAAACCAAA GCCAAATCCG GATCCAAAAC CAGATCCAGA TAAACCAAAG CCAAATCCGA 4140 

ATCCAAAACC AGACCCTAAT AAGCCAAATC CTAACCCGTC ACCAGATCCC GATCAACCTG 4200 

GGGATTCCAA TCATTCTGGT GGCTCGAAAA ATGGGGGGAC ATGGAACCCA AATGCTTCAG 4260 

ATGGATCTAA TCAAGGTCAA TGGCAACCAA ATGGGAATCA AGGAAACTCA CAAAATCCTA 4320 

CTGGTAATGA TTTTGTATCC CAACGATTTT TAGCCTTGGC AAATGGGGCT TACAAGTATA 4380 

ATCCGTATAT TTTAAATCAA ATTAATAAGT TGGGCAAAGA TTATGOAOAA GTTACTGATG 4440 

AAGACATTTA TAATATTATT CGAAAACAAa ATTTCAGCGG AAATGCATAT TTAAATGGAT 4500 

TAO^CAGCA ATCGAATTAC TTTAGATTCC aATATTTCAA TCCATTGAAA TCAGAAAGGT 4560 

ACTATCGTAA TTTAGATGAA CAAGTACTCG CATTAATTAC TGGTGAAATT GGATCAATGC 4620 

CAGATTTGAA AAAGCCCGAA GATAAGCCGG ATTCAAAACA ACGCTCATTT GAACCGCATG 4680 

AAAAAGACGA TTTTACAGTA GTTAAAAAAC AAGAAGATAA TAA6AAAAGT GCGTCAACTG 4740 

CATATAGTAA AAGTTGGCTA GCAATTGTAT GTTCTATGAT GGTGGTATTT TCAATCATGC 4800 

TATTCTTATT TGTAAAGCGA AATAAAAAGA AAAATAAAAA CGAATCACAG CGACGATAAT 4860 

CCGTGTGTGA TTCGTTTTTT TTATTATGGA ATAAAAATGT GATATATAAA ATTCGCTTGT 4920 

TCCGTGGCTT TTTTCAAAGC CTCAGGATTA AGTAATTGGA ATATAACGAC AAATCCGTTT 4980 

TGTAACATAT GGATAATAAT TGGAACAGCA AGCCGTTTTG TCCAAACATA TGCTAATGAA 5040 

AATATTAATG AACTTACTGT TGTAGCAATA ATAAATGCCA CGATACGATT ACCTTTAATC 5160 

GCATTAAATA ATTCTCCAAA GATTACTTTT CTGAATACAT ATTCTTCTAA TAAAGGACCA 5220 

ATAATAGATA CAAAGAAGAT AAATATAGGT ATTTTTCGAG CAATAATAAT TAGCTTTTCT 5280 

GTATTAGGAC TTACTTGTTG TCCACCATAA ATTTGCGTTA ATACAATGCT CACTACCATT 5340 

TGATAAATCA TTACCAATGC AAATCCAAGC AATGCCCATG GAATGATATA TTTTTTAGGT 5400 

TCTTTAACTT CTAATTCTAA TTTTGTTGGA TTTTTAATTT TTAAATTAAT TAAAATAATC 5460 

GTCGTGGCGG CGATTAAAAA TAGAACAAGT TGTATGTAAA TGACTGCTTT AGTCAGTTCT 5520 

ATGCCACTAT ATTGTACAAA TGGTAATTTT TTTACAATGA 6AAGCGGTAA AAATTGAGAC 5580 

AATATATAAA TAATAACAGT TAGCAATGAT GCCCATAATC tTGTCATAAT TTTCCTCCAA 5640 

ATATTTGTTT ATAATTTATT TTATCGTAAA TAACTTGAAG TTACAAAACT TAATTAAAAG 5700 

GTTATGACTT GAAATTTTGA CCAAATTTGA TTATTATAAA TGTAT6TTAG CACTCTTTAA 5760 

TGTTAAGTGC TAAACTTTAG GTTTTTTAAG GAGGAACAAT CATGCTAAAA CCAATTGGAA 5820 

ATCGTGTGAT TATTGAGAAA AAAGAACAAG AACAAACAAC TAAAAGTGGn ATTGTTTAAC 5880 

TGATAGTGCT AAAGA 5895 
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6796 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
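
The rows of each sequence description in this listing are printed as space-separated groups of ten bases followed by a cumulative base count, and the final count should equal the LENGTH stated under SEQUENCE CHARACTERISTICS. The following is a minimal sketch, not part of the original disclosure, of how a transcribed copy of the listing might be checked against those counts. The function names `parse_row` and `check_rows` are hypothetical, and the sketch assumes that any character outside the IUPAC nucleotide codes is transcription damage.

```python
# Illustrative sketch only: parse_row and check_rows are hypothetical helpers.
# Assumed row layout: groups of up to ten bases, then an optional running count.

# IUPAC nucleotide codes; lower-case letters in the listing are folded to upper case.
IUPAC = set("ACGTURYSWKMBDHVN")


def parse_row(line):
    """Split one listing row into (groups, running_count_or_None)."""
    tokens = line.split()
    count = None
    if tokens and tokens[-1].isdigit():
        count = int(tokens.pop())
    return tokens, count


def check_rows(lines, start=0):
    """Return (total_bases, problems) for consecutive rows of one sequence.

    `start` is the number of bases that precede the first row supplied.
    """
    total = start
    problems = []
    for number, line in enumerate(lines, 1):
        groups, count = parse_row(line)
        for group in groups:
            if not set(group.upper()) <= IUPAC:
                problems.append((number, "non-IUPAC characters", group))
            total += len(group)
        if count is not None and count != total:
            problems.append((number, "running count mismatch", count))
            total = count  # resynchronise on the printed count
    return total, problems


# The last two rows of SEQ ID NO: 2, which follow base 6720.
rows = [
    "CGTTCTTTAG GAATGAAAGG TACTTTCATT AAAGTTAAAA AAGAAAAATT CTTAGATGAA 6780",
    "TTAGAAAAAA GTAAAT 6796",
]
total, problems = check_rows(rows, start=6720)
print(total, problems)
```

A row whose printed count disagrees with the bases actually present, or whose groups contain characters outside the IUPAC set, is reported rather than corrected, since the intended base cannot be recovered from the row alone.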

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

TTK^AAAAA CAAGGTACGA TTGGTTTAAT AACATATATG AGAACCGATT CTACACGTAT 60 

TTCaGATACT GCCAAAGTTG AAGCAAAACA GTATATAACT GATAAATACG GTGAATCTTA 120 

CACTTCTAAA CGTAAAGCAT CAGGGAAACA AGGTGACCaA GATGCCCATG AGGCTATTAG 180 

ACCTTCAAGT ACTATGCGTA CGCCAGATGA TATGAAGTCA TTTTTGACGA AAGACCAATA 240 

CCGATTATAC AAATTAATTT GG6AACGATT TGTTGCTAGT CAAATGGCTC CAGCAATACT 300 

TGATACAGTC TCATTAGACA TAACACAAGG TGACATTAAA TTTAGAGCGA ATGGTCAAAC 360 

AATCAAGTTT AAAGGATTTA TGACACTTTA TGTAGAAACT AAAGATGATA GTGATAGCGA 420 

AAAGGAAAAT AAACTGCCTA AATTAGAGCA AGGTGATAAA GTCACAGCAA CTCAAATTGA 480 

ACCAGCTCAA CACTATACAC AACCACCTCC AAGATATACT GAGGCGAGAT TAGTAAAAAC 540 

AAAGCGTAAC TATGTCAAAT TAGAAAGTAA GCOTTTTGIT CCTACTGAGT TGGGAGAAAT 660 

AGTTCATGAA CAAGTGAAAG AATACTTCCC AGAGATTATT GATGTGGAAT TCACAGTGAA 720 

TATGGAAACG TTACTTGATA AGATTGCAGA AGGCGACATT ACATGGAGGA AAGTAATCGA 780 

CGGTTTCTTT AGTAGCTTTA AACAAGATGT TGAACGTGCT GAAGAAGAGA TGGAAAAGAT 840 

TGAAATCAAA GATGAGCCAG CCGGTGAAGA CTGTGAAATT TGTGGTTCTC CTATGGTTAT 900 

AAAAATGGGA CGCTATGGTA AGTTCATGGC TTGCTCAAAC TTCCCGGATT GTCGTAATAC 960 

AAAAGCGATA GTTAAGTCTA TTGGTGTTAA ATGTCCAAAA TGTAATGaTG GTGACGTCGT 1020 

AGAAAGAAAA TCTAAAAAGA ATCGTGTCTT TTATGGATGT TCGAAATATC CTGAATGCGA 1080 

CTTTATCTCT TGGGATAAGC CGATTGGAAG AGATTGTCCA AAATGTAACC AATATCTTGT 1140 

TGAAAATAAA AAAGGCAAGA CAACACAAGT AATATGTTCA AATTGCGATT ATAAAGAGGC 1200 

AGCGCAGAAA TAATATTTTT ATTTCCTAGA TACATTTTAA GATTGTTAAA TAGAATCATT 1260 

AGTGAATCTT ATTTTAAAGA TAGTAAAGGA TTAATCTAAA TAAGTGCGGA TAATATAAAC 1320 

ATAACAACAT AATTAAmAGA CATAAATGAC aATAAAAGGA GTATAGAAAT GACTCAAACT 1380 

GTAAATGTAA TAGGTGCTGG TCTTGCCGGT TCAGAAGCGG CATATCAATT AGCTGAAAGA 1440 

GGAATTAAAG TTAATCTAAT AGAGATGAGA CCTGTTAAAC AAACACCAGC GCACCATACT 1500 

GATAAATTTG CGGAACTTGT ATGTTCCAAT TCATTACGCG GAAATGCTTT AACTAATGGT 1560 

GTGGGTGTTT TAAAAGAAGA AATGAGAAGA TTGAATTCTA TAATTATTGA AGCGGCTGAT 1620 

AAGGCACGAG TTCCAGCTGG TGGTGCATTA GCAGTTGATA GACACGATTT TTCAGGTTAT 1680 

ATTACTGAAA CACTTAAAAA TCATGAAAAT ATCACAGTTA TTAATGAAGA AATTAATGCC 1740 

ATTCCAGATG GATACACAAT TATCGCAACA GGACCACTTA CTACAGAAAC CCTTGCGCAA 1800 

GAAATAGTGG ACATTACTGG TAAAGATCAA CTTTATTTCT ATGATGCGGC TGCTCCAATT 1860 

ATTGAAAAAG AATCTATTGA TATGGATAAA GTTTACTTAA AGTCCCGTTA TGATAAAGGT 1920 

6AAGCTGCAT ATTTAAACTG TCCTATGACT GAGGATGAAT TTAATCGCTT TTATGATGCA 1980 

GTATTAGAAG CTGAAGTTGC GCCTGTAAAT TCATTTGAAA AAGAAAAATA TTTCGAGGGT 2040 

TGTATGCCTT TTGAAGTAAT GGCAGAACGC GGACGCAAGA CATTACTATT TGGACCAATG 2100 

AAACCAGTAG GATTAGAAGA TCCAAAGACT GGGAAACGTC CTTATGCGGT GGTTCAATTA 2160 

AGACAAGATG ACGCTGCTGG TACACTCTAC AATATTGTTG GCTTCCAAAC GCATTTAAAA 2220 

TGGGGAGCTC AAAAAGAAGT CATTAAATTA ATTCCAGGCT TAGAAAATGT TGATATTGTT 2280 

AGATATGGTG TGATGCATAG AAATACCTTC ATTAATTCAC CGGACGTATT AAACGAGAAA 2340 

TATGTAGAAA GCGCAgcTAG CGGCTTAGTT GCAGGTATCA ATCTTGCGCA TAAAATATTA 2460 

GGCAAGGGTG AGGTAGTATT TCCGAGAGAA ACAATGATTG GAAGTATGGC TTACTATATT 2520 

TCTCATGCTA AAAACAATAA GAATTTCCAA CCTATGAATG CTAACTTCGG GTTATTACCA 2580 

TCTTTAGAAA CTAGAATTAA AGATAAAAAA GAACGCTATG AAGCACAAGC TAATAGAQCT 2640 

TTGGATTACT TAGAAAATTT CAAAAAAACT TTATAAAATA GTTAGAAAGA CTAGATATGC 2700 

TATTCATTCT TAAGTCATCA ACGAGTAAGT AATGACTTTC TAAATGGAAA ATACTTATCC 2760 

TAGTCTTTTT AATTTTGGAA TTGTTACGTA TTTCTGACAA TTTAGAATTC GCATTCAAAA 2820 

AATATCTAAA TAAATAACAC GCAATAAGTT GATTGATGTA ACATGTAAGA GAATGTTTTA 2880 

AATAAACTTT ATTTAAAAGG CAATGAAATA ATAAATGGCA AGGCTATTAA TAAAGACTTT 2940 

TAGTAATTAA TTTAAAAAAG AGGTATTCTA ATTAACAGGT TTTCCGATTA GTTACAATTA 3000 

TTTAATTCTC AAAAGATTTA GAATTGATTA TCAAATTACT GTAAGCCCTT TGCTGTATAT 3060 

GCTACAATTC TTATTGATGG AGGGTAAATG TATTGAATCA TATTCAAGAT GCGTTTTTAA 3120 

ATACATTGAA AGTTGAACX3G AATTTTTCGG AACACACATT GAAATCATAT CAAGATGACT 3180 

TAATTCAGTT TAATCAATTT TTAGAACAAG AACATTTAGA GTTGAATACT TTTGAATACA 3240 

GAGATGCTAG AAATTATTTG AGCTATTTAT ATTCAAATCA TTTGAAAAGA ACATCTGTTT 3300 

CTCGTAAAAT CTCAACGTTA AGAACTTTCT ATGAATATTG GATGACGCTT GATGAGAACA 3360 

TTATTAATCC ATTTGTTCAA TTAGTACATC CGAAAAAAGA AAAATATCTT CCGCAATTCT 3420 

TTTACGAAGA AGAAATGGAA GCGTTATTCA AAACTGTAGA AGAGGACACT TCAAAAAATT 3480 

TACGGGATCG AGTTATTCTT GAATTGTTGT ATGCTACAGG CATCCGTGTT TCGGAATTAG 3540 

TAAATATTAA AAAACAAGAT ATAGATTTTT ACGCGAATGG TGTTACCGTA TTAGGAAAAG 3600 

GGAQ^JUUUSA GCGCTTTGTA CCGTTTGGTG CTTATTGTAG ACAAAGCATC GAAAATTATT 3660 

TAGAACATTT CAAACCAATT CAGTCATGCA ATCATGATTT TCTTATTGTA AATATGAAGG 3720 

GTGAAGCAAT CACTGAACGC GGTGTACGAT ATGTTTTAAA TGATAT7GTT AAACGAACAG 3780 

CAGGCGTAAG TGaGATTCAT CCCCACAAGC TCAGACATAC ATTTGCAACG CATTTATTGA 3840 

ATCAAGGTGC AGACCTAAGA ACAGTACAAT CGTTATTAGG TCATGTTAAT TTGTCAACAA 3900 

CTGGTAAATA TACACACGTA TCTAACCAAC AATTAAGAAA AGTGTATCTA AATGCACATC 3960 

CTCGAGCGAA AAAGGAGAAT GAAACATGAG TAATACAACA TTACATGCAA CAACAATTTA 4020 

TGCTGTAAGA CATAATGGGA AAGCAGCTAT GGCTGGAGAT GGGCAAGTAA C6CTTGGTCA 4080 

ACAAGTCATC ATGAAACAAA CGGCAAGAAA AGTGCGACGT TTATATGAAG GTAAAGTGTT 4140 
ATTACAACAG TTTAGTGGTA ACTTAGAAAG AGCTGCTGTT GAATTGGCAC AAGAATGGCX5 4260 

AGGCGATAAA CAATTACGTC AATTAGAAGC TATGCTAATT GTAATGGATA AAGATGCTAT 4320 

TTTAGTTGTC AGTGGAACTG GCGAAGTTAT TGCTCCAGAT GATGACCTTA TCGCTATTGG 4380 

ATCAGGAGGC AACTACGCAT TAAGCGCAGG ACGTGCATTG AAACGCCATG CATCGCATTT 4440 

GTCTGCTGAA GAAATGGCAT ATGAGAGCTT GAAAGTAGCG GCTGATATTT GTGTCTTTAC 4500 

CAACGATAAT ATTGTTGTCG AAACACTATA ATAATCAGAG CACGATAAAT AATTACGAGC 4560 

AATTAATTTT AGTTAAAAGA CGGAGGAATG AAATTAATGG ATACAGCTGG AATAAGATTA 4620 

ACTCCAAAAG AAATCGTATC TAAATTAAAT GAATACATCG TTGGACAAAA TGATGCTAAA 4680 

CGTAAAGTGG CAATTGCCCT ACGTAATCGA TACAGAAGAA GTTTATTAGA TGAGGAATCA 4740 

AAGCAAGAAA TTTCACCTAA AAATATTTTG ATGATTGGAC CAACCGGCGT TGGTAAAACT 4800 

GAAATTGCAA GAAGAATGGC CAAAGTTGTC GGCGCGCCAT TTATAA7VAGT AGAAGCTACT 4860 

AAATTTACTG AGGTAGGTTA TGTAGGACGA GATGTTGA/^ GTATGGTTAG AGATCTTGTT 4920 

GATGTTTCAG TAAGATTAGT CAAGGCGCAG AAAAAATCAT TGGTACAAGA TGAAGCAACA 4980 

GCTAAGGCCA ATGAAAAACT TGTTAAGTTA TTAGTTCCAA GTATGAAAAA GAAAGCGTCT 5040 

CAAACGAATA ATCCTTTAGA GTCACTTTTC GGAGGTGCAA TTCCAAATTT CGOACAAAAT 5100 

AACGAAGATG AAGAAGAACC ACCTACTGAG GAAATTAAAA CAAAACGTTC TGAAATTAAG 5160 

AGACAGCTAG AAGAAGGCAA ACT7GAAAAA GAAAAGGTAA GAATTAAAGT CGAACAAGAT 5220 

CCTGGTGCTT TAGGTATGCT AGGTACAAAT CAAAATCAGC AAATGCAAGA GATGATGAAT 5280 

CAATTAATGC CTAAAAAGAA AGTTGAGCGA GAAGTTGCTG TTGAGACGGC AAGGAAAATC 5340 

TTAGCTGATA GTTATGCGGA TGAACTAATT GATCAAGAAA GCGCTAACCA AGAAGCGCTT 5400 

GAA'tTAGCAG AACAAATGGG TATCATCTTT ATAGATGAAA TCGACAAAGT TGCGACGAAT 5460 

AATCATAATA GTGGTCAAGA TGTCTCAAGA CAAGGTGTTC AAAGAGATAT TTTACCTATA 5520 

CTTGAAGGTA GCGTTATTCA AACCAAATAT GGTACTGTGA ATACTGAACA TATGCTGTTT 5580 

ATAGGTGCTG GAGCTTTCCA TGTATCTAAG CCGAGTGACT TGATACCAGA ATTGCAAGGT 5640 

CGTTTTCCGA TTAGAGTTGA ACTTGATAGT TTATCGGTAG AAGATTTTGT AAGAATTTTG 5700 

ACAGAACCAA AATTGTCATT AATTAAACAA TATGAAGCAT TGCTTCAAAC AGAAGAAGTT 5760 

ACTGTAAACT TTACCGATGA AGCAATTACT CGCTTAGCTG AGATTGCTTA TCAAGTAAAT 5820 

CAAGATACAG ACAACATTGG TGCACX5TCGA CTTCATACAA TTTTAGAAAA GATGCTAGAA 5880 

GATTTATCAT TC6AAGCACC AAGTATGCCG AATGCAGTT6 TAGATATTAC CCCACAATAT 5940 
AAATATACAA AAGGAGAAAA ATTCATGAGC TTATTATCTA AAACGAGAGA GTTAAACACG 6060 

TTACTTCAAA AACACAAAGG TATTGCGGTT GATTTTAAAG ATGTAGCACA AACGATTAGT 6120 

AGCGTAACTG TAACAAATGT ATTTATTGTA TCGCGTCGAG GTAAAATTTT AGGATCGAGT 6180 

CTAAATGAAT TATTAAAAAG TCAJ^GAATT ATTCAAATGT TGGAAGAAAG ACATATTCCA 6240 

AGTGAATATA CAGAACGATT AATGGAAGTT AAACAAACAG AATCAAATAT TGATATCGAC 6300 

AATGTATTAA CAGTATTCCC ACCTGAAAAC AGAGAATTAT TCATAGATAG TCGTACAACT 6360 

ATCTTCCCAA TTTTAGGTGG AGGGGAAAGA TTAGGTACAT TAGTACTTGG TCnAGTACAT 6420 

OATGATTTTA ATGaAAATGA TTTGGTACTA GGTGAATATG CTGCTACAGT TATTGGTATG 6480 

GAAaTCTTAC GTGAGAAGCA TAGTGAAGTA GAAAnAGAAG CX5CGCGATAA AGCTGCTATT 6540 

ACAATGGCAA TTAATTCATT ATCTTATTCT GAAAAAGAAG CX3ATTGAACA TATCTTTGAA 6600 

GAACTTGGCG GTACGGAAGG CCTATTAATC GCATCAAAAG TTGCAGATAG AGTTGGTATT 6660 

ACTAGATCTG TAATTGTAAA TGCACTACGT AAATTAGAAA GTGCTGGTGT AATTGAATCA 6720 

CGTTCTTTAG GAATGAAAGG TACTTTCATT AAAGTTAAAA AAGAAAAATT CTTAGATGAA 6780 

TTAGAAAAAA GTAAAT 6796 
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2073 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATCCTAAAAT TnAAAATTAT CACGCCTTTT GaACAGCTTT GTAACCaTCt GGACGATCAT 60 

kAAATTCCaA TGTAAATCCT GGTTTAAaGT TGATCTTTAA CCTTATTTAA AyCACCAATT 120 

GTACGTATAT TATGTTGTTT AGCAAAATCA CGTTTTACAG CTAAAGCATA CGTATTGTTA 180 

TACTTCATTG GTTTTAACAT AGTCATTTGA TATTTCTTTT CAAGACTTTG CTTAGCTTGT 240 

TCATAAACTT TTTTCTCTTC TTTTGACTTC AATGGTTCTT TTGTTAATTC ACCTAAAACT 300 

GTTCCAGTAA ATTCTAAATA CCCATCTATA TCGTCAGATT TTAAAGCATT AAATAAAAAT 360 

GCTGTTTTGC CCATACCATC TTTCACTTCT ACAGTATTTT TGGTCTCTTC TTCTATTAAA 420 

ATTTTATACA TATTTGTAAT AATCGATGGC TCGGAGCCAA GCTTTCCAGC TAACGTAATT 480 

TTATCACCTT TTTGTGCAAA CATAGGAATA GCGATAGCCA GTATAATAAT CATCACTATA 540 
TCAAATATAA TTGCCAATAA GGCTGCTGGA ATTGCACCTA ATAATATCAA CGATGCATTG 660 

TTACGGTCTA TACCTAATAA AATTAAATCT CCTAGTCCGC CTGCACCAAT TAATGCTGCT 720 

AGTGTTGCTG TACCTATAAT TAATACCATA GCCGTTCTTA CACCAGCCAT TATAACAGGC 780 

ATTGCTATCG GAAGTTCGAC TTTAGTTAAA CGTCTAAATG GTTTCATACC TATACCTTTA 840 

GCCGCTTCAA TGAGTGATGG ATCAACTTCT TTAATTCCAG TATACGTATT CCTTAAAATT 900 

GGTAACAACG CATACACTAC AAGTGCAATA ATTGCTGGCA CACGACCGAT ACCAAATAAA 960 

GGAATCATTA AACCTAATAA TGCCAACGAT GGTATGGTTT GAAGAATTGC CGCAATATTC 1020 

ATTACGATTT CAGATATCGT TTTAGTCTTC GTTAATAAAA TACCTAATGG TACCGCAATA 1080 

GCAGTT6CAA TCAATAATGC GATAAATGAT ATTTGAATAT GTTCTATCAT TGTCGAAAAG 1140 

AGTTGCCCCT TACGTTCACT CAATATGTCg AAAAAGTTAG TCATGTTGAG CTACCTCCTT 1200 

TTTCTGGGAC AAATATTTGA AGATATCTTT CCTATCAATA ACATATTGAC CTACGCTATC 1260 

TTCTTGCATG ACAATGACAC GCTCGCTCTC TGATAAAAGT TGATACAATA CTTCAATTGG 1320 

TTGATTGTCA TAAACAATTG GATAAGCGCT CATAGATGTA ACCTCATCGA TTGGTTTCAT 1380 

AATATCCAAG TCACGGATAA TTGCGTTCTC TTCAACACAT GGCGCATCAT CTTCTAAATG 1440 

ACTACCCATA AATTGTTTAA CAAATTCACT TTGAGGATTA TTTTTAAATC CTTCTGGTGT 1500 

GTCAATTTGT TCAATATGCC CTTCATTCAA AAGACAAATC TTATCACCAA GTTTCATCGC 1560 

CTCTTGAATA TCATGTGTAA CAAATATGAT TGTCTTCTTA ATTTTAGTTT GTAATTCAAT 1620 

TAAATCATCT TGAAGTTTTT CTCGGCTGAT TGGGTCTAAT GCACTAAACG GTTCATCCAT 1680 

TAAAATAACT GGTGGATCAG CTGCTAACGC ACGTATAACT CCTACACGTT GTCGTTGCCC 1740 

CCCTGACAAT TCATCAGGTT TTCTGTTTTT ATATTTTTCA GGTTCTAATC CAACCATTTC 1800 

AAGTBATTCA TCTACTCTTT TATCTATATC TTTTTCTTTC CACTTTTTCA TTTGTGGCAC 1860 

TTGTGCAAtA TTTTCTTTGa wTGTCaTATG TGGGAATAAT GCAATCTGCT GcAATACGTA 1920 

TCCAATATCC CAACkCATTT CGTATACTGG ATAATCACTT ATTGGTTTAT CTTTAAAATA 1980 

AATATAACCT TCACTT7UVGT GAATGAGTCG ATTAATCATT TTTAATGTCG TAGTTTTTCC 2040 

ACAACCTGAA GGTCCAATTA GCACAAAAAA TTC 2073 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13321 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

ACTATTCTAG CTTCATCAGT TATCATATAT TCTTTGAAAC ACTTGTAAGA AAATATAATG 60 

AGTATTTACT ACATAATGAT ATTTCAAATT AGAAAAAAGG AAGTTATGAT TTAATGGCCT 120 

TQAGCCTATC ATAACTTCCT TTTATCATTT TATTGTTGTG TTGATGTTTC GATAACGTGG 180 

TACATCTTAT CAAACATCAA TTCGAAACCA TGCACCATGG CATCATGATA TTCTTTTTTC 240 

TTTTGCTTGT ATTCTAAATT AGTAAATCGT CTTTCTTTTT CAACTAATGA ACGATAATAA 300 

AATAGCATTT GGGTGCCACC TGTTTCACGT TCAAAAAATT CTACCTCAAT GACATCTTGC 360 

GTTTCACTTA GTCCAGGCAT ACCGATAGTC ATCTTAACGT ATTCATCCAT AACTAAAGAT 420 

TCATAAATGC CTTCAATCAC ATTTACTTTG CCATTACGTT GTTGATCTAC AATACGATAT 480 

TTACCGCCTT CTTTAACGTC CGCTTCAATC TCTTTATTCG TTCTGGCTGA TGTCATAAAC 540 

CATTGTTTCA ACAAATCTTT CTTTGTCCAA GCTTCGTATA CTAACTCTGG AGAAAATTTA 600 

TAAAGCTTTT CAATTTCAAC TTCGACATGT TCATTCTCTA CATTAAATTT TOCCACTQTT 660 

GTCCACCCAC TTTCGCTCTT ACTTTTATTT TAACGTATTT TTGCTCAGTT CCAAACATAG 720 

ATGATCATCA TTTTTAAAAG ATTAGCGTTA TACGGTGAGT ACAACATGAT CTGTTAATAT 780 

AACAAGCCAC CTTACTTGGC TACATCGATA TATTGTTAAG CATTAATGTT TCATTTCTTG 840 

ACTAGTGTTC TTTTTTAGCT TTGGAAAATT AAATAAAATC GCAATAAGTC CGCATACACC 900 

TAATAATATA GGATAAATGC TGTATGGGAA TAACATTAAC GGTGAAATAC CAGCTACACC 960 

AGCCGCTGaA ATGACTTGCG GGCTATATGG TAATAAACCT TGGAAGCAGC CTCCAAATAT 1020 

ATCAAGAATA CTTGCTGATT TCCTTGAATC TACATCATAT TCATCTGCAA TATTTTTAGC 1080 

TAAAGGACCT GACATAATAA TAGAGATGGT GTTGTTTGCC GTGGCAATAT CTGCGACACT 1140 

TACCAAACTA GCAATTCCTA ATTCTGCGCC ACGCTTTGAT TTCACTTTAG AGCGAACAAA 1200 

TTGCAACAAC CATTCAATAC CACCATTGTG TTGAATAATA CCGACTAAAC CACCAATTAG 1260 

CAACGCAATC ATAGCAATAT CTTCCATGCT TATAATACCT TTGGACACTG CATCTAGTAG 1320 

CCCCATCCAA CCGAATGAAC CATCTATGAG ACCAATGATT CCGGCTAATA ATGTTCCGCC 1380 

AATCAATACG ATAATGACAT TTACACCTAA TAATGCTAAT ACCAATACTA AGATATACGG 1440 

TACAACTTTA ATTAGATTAT AATCATAGTt TTTAGCATGA TTTAAAGAAA TGCCATTCGT 1500 

TAAGAAATAC AGAATAATAA TCGTTAAAAT AGCACCTGGC AATACAATTT TAAAGTTTAC 1560 

TCTGAATTTA TCTTTCATTT TCGTATGTTG TGTTCTAACC GCAGCAATTG TTGTATCTGA 1620 

AATCATTGAT AGATTATCGC CGAACATTGC ACCTCCAACA ACTGTAGCCa tTGctAGCGC 1680 
TCCTACAGAC GTCCCCATAG ATATAGAAAC AAACATACAA ATCACAAACA ATCCTACAAT 1800 

AATTAAATTT TCTGGGATTA ATGATAGTCC TAAATTAACT GTCGACTTTA CGCCACCCAT 1860 

TTTTTCAGCT GTATTTGAAA ATGCACCTGC TAAAATAAAA ATCAACATCA TTAAAACAAT 1920 

GTTTGAATGG CCTGCACCTT TCGTGAAGAC CTCAACTTTT TTAGCAAATG ATTCTTTTCG 1980 

ATTCATTAAT AACGCCACAA TTACCGTTAT CGTAATTGCA ACATTTAATG GCATTGAAGT 2040 

AAAATCACCT GTGATAATAC CTACGCCTAA AAACAACGCC ACAAATAATA ACAAGGGGAA 2100 

TAATGCCCAA GCATTGCTCT TTTTATGTAC TTCCATCCTT TTTACCTGCT TTCCAATTAA 2160 

AAATACCTCT TTCTCACAAA CGATGAAGAA AGAGGTTTTC ATGTGCTTTA CCTGCTTATC 2220 

TTCAAACCAT TACGGTTACT GGAATTGGCA CATTCGAGAT GTTGCCGAGG CTTCATAGGG 2280 

CCAGTCCCTC CACCTCTCTA GATAAGTGAT GCTTATTTAC GTTTACGTTA CAAGATAATC 2340 

CTTAGTACGT CAATCATAAA TTAATCAGGA GTCGTATAAT ATTTTTCATA AACAATCATT 2400 

GCTACTGTAA TAATAATCAA AACAATAATG CTAATAACAA GTAAAAGCCA CCATTTAAGC 2460 

ATTAATGCAA TAAAAATGAA CACGATAGAC ACACTTACTA ATATTAATGA TATGACTTTA 2520 

AATTGCTGAA CACGTTGCTT GGAGATGACT TTCAACTGTT TGTTTGATAG ACGCGTATTT 2580 

TTTATACTGA TTCCCAGTAT ATTTTCTAAT ATTTGAACCA ATACGATACT TATTGCAAAT 2640 

ATAATAATTG GTAAAACATC ATAGCTCCCT ATAGTTAATG TATAAATTAC AAATCCAATG 2700 

TAAAGTAACC CTGAGACAAA GGATAAAAAG TATGCGACX5T ATTTGTTAAA CTTAATGATA 2760 

TGCTTTTTAA CGTTTTGATG TGTAAACCAT ACATTCGAAA CGATCGCAAC TGCTACAAAT 2820 

AATGTGAATA CTATATATAA TGGTAATTTT TGTTCAGGAA AAACAGTCGC TATTCCAAAA 2880 

GCTAATGCTA AAATCAAAAA TAATATAGCT CTAGATACTA TTAATGCCAT AATAACAACC 2940 

CCTTTGTTTA ATATCGAGTT TGCAAATTTA CGTTTATCAG CGTTTCTATG ATCAGTACTT 3000 

CTACGGGTAG CGTTTCTATG TAATTTACAT CATCTTAACA TATAAATACT TCGCTATTTA 3060 

ATTGAAAACA TATCCTATTA TTCTTTGTCC GTTCTGACGT TTAATATCTA GCCTTAGGCA 3120 

TTTCACTTGT TAATGAATTT AACTTTCTTC CACTAACCGT CCCTAAACCC AATCCCGCAA 3180 

CAGTTTTTAA CTTTTTCGTT GTTGTCCTGA CATCCTCATT AAGAAAGTTT ATTCTGCTTA 3240 

AAACTTATAA TCCACACCCT GAGCAAACGC TCCTTATGAC AGAGTATTAA AATAAGCCGA 3300 

TAAAGATACA CACCTTTACC GACTATTTAA AATACACTTC ACCAATTCAT TTTAATTTAA 3360 

TGGATTGAAG TAACTAAATT AATATTATGT TGTTCAATTA AAAGCTTCAT ACAAACCTAA 3420 

TCTATTTGCA CTCCACCGCT AACACCGAAC ACTTGTCCGG TTGTATAACT TGATTCTTCT 3480 
GTTTiTi'GAC CAAATGTTGG GATTTTACTT TGAGGTTGTC CACCAGAAAT TTGTAATGGT 3600 

GACCAGAATG GACCAGGCGC TACACAGTTC ACTCTAATTC CTTTTGGTCC TAATTCTTCT 3660 

GAAAAACTTT TAGTTAATGA AATAATTGCT GCTTTTGAAG CGGCATAATC ATGAAGAATA 3720 

GGACTAGGAT TATAACCTTG TACAGATGAT GTCGTTGTAA TTGACGCACC CGGTTTTAAA 3780 

TATTCCAATG CTTTTTGAAC TGTCCAAAAT AGCGGATAGA CATTCGTTTC AAATGTTTCT 3840 

GTAAATGCCT CAGTTGTAAA TCCATGAATA TCATCATGAT ACTGTTGATG TCCAGCAACT 3900 

AAAGTAACAT TATCTAAGCC ACCTAATTGT TGATATGCTT GTTCAACAAG GTCATAGTTG 3960 

AACTGTTCAT CTCTTATATC ACCAGGAATT AACACTGCCT TTTGACCACT TTCTTCAATC 4020 

ACTTGGCGTA CTTCTTGTGC ATCTTGTTCT TCACTCGGAA GATAGTTAAT CGCTACATCT 4080 

GCACCTTCTT TAGCATACGC AATTGCTGCT GCACGCCCTA TTGCTGAGTC ACCACCTGTG 4140 

ACTAATATTT TATAGCCTTG TAAGCGTTGA TGACCTTGGT AAGACGTTTC GCCACAATCG 4200 

GGTGCTGGCG TCATTTCAGA TTGTAAACCC GGTACCTCTT GTTCTTGTTT TTCATAATCC 4260 

GTTGTTTTAA ATTTTGTTCT AGGATCTTGA GCTGCCATTT TTTTACATCT CCTTATTCGC 4320 

TTAATGGTTA TTATTTACCC AATCTTCCTA GGAACTTAAT CATGATTACA CTAAAAATTA 4380 

CTTTCTTCTT TATAAAAACA AGCTCGAATT ATTCATGCAA TAGTCTCTTT ACAAATTCAA 4440 

CAAAATACTC AGGTACTTTT TCCAGAATCC TTTCATCCGG TTTATATTGA GGATGATGTA 4500 

AATCATATTC ACTATGAGAA CCAATTAACG CAAATACACT TGGAAAATGT TGACTATAAC 4560 

CTGAAAAATC TTCTCCAATC GTAAGCGGCT GTTCCATCAT TCCCACCTTA TATCCAACAT 4620 

GTTGGGCTAC TGCAATTGCT TTATGCGTCA ATGCCTCATC ATTCATCACA GCGCCAGGTA 4680 

AATGCGTATA ATTTAAATTA ATTTTCATAT TATATGCTTG AGCCAATCCG TCCGCAATAT 4740 

CTTGJAATCG TGTTTCTACA AGCTTTCGTA CCACAGGATC AAAACTACGC ACTGTGCCTT 4800 

GTACATACGC ATGATCAGCA ATGACATTCC AAGTATTACC ACATGATATT TGTCCAATTG 4860 

TTACTACCGC TTCATCAAAC GCAGATAGAT TTCTACTAAC TATGGATTGA ATACTATTAA 4920 

TCAATTGCGC CAACACAATA ACTGGATCGT TGCATTGTTC TGGcTTTGCA GCATGACCAC 4980 

CCACGCCTTT AATATGAAAC TCAAAACGAT CTACTGCTGA TGTAATTGCC CCTGTTTTGA 5040 

TTGCAAATGT ACCTACCGAA CGCGATGGGT CATTATGAAA ACCCAATACT GCTTGTACAT 5100 

CTTTTAATGC ATGTGTTTCA ATAATTTTAA AAGCGCCATG TCCTAGTTCT TCTGCTGATT 5160 

GAAAAATGAA TTTAACACGC CCAGTAAGAG TGCCCTCAAT TTCTTTTAAT TTTACAGCTG 5220 

TAGCCAAAAT ACTAGCCATG TGAATATCAT GACCACACGC ATGCATAACA CCTTCATTTT 5280 
CAGCTATACA ACTCAGACCT TGTCCCACTT CAGCAACAAG CCCAGTCGCA AGTGGTAAGT 5400 

CTAATATTCT AATATGATGT TCTGTTAAAA TATCTTTAAT TTTTTGTGTA GTCTTAAATT 5460 

CTTTATCGGA TAGTTCTGGA AATTGATGAA AATACCTTCT CCAGGTAACA GCTTGATCTT 5520 

TTAATCCCAT CGGTCATTCC CCTTCCTTAA GTCAATGATA TGTTGTCTAC CCTACGATGA 5580 

TCATCTTTGA CTATTAAACG ATGATTTCAC AACAATGTAC TCTTGTTAAT TGCTTTCGTT 5640 

AATGATAGAC AGTTGTTTAA TAATATCGTA ACACTGTTGT CAAACTATTC TAACTTTTAT 5700 

AATTGAGACT CTATACAAAA ACGTGTTCTC GAATATACTT GTTTTTACAA ACCACAAAAA 5760 

GCTCTAAACA TTAGTTTAAA CCAATGCTTA GAGCTTTCTA ATTATTTTAT GCTTTAAAAG 5820 

ATACTGTGTT ATCTACGATG ACCTTACCGT CTTTAATAAC TTTTTCTGCG TGATTGATAC 5880 

CAAAATGATA TGGAATATAT TCATGATTTG GTGCATCCCA AATTACTAAA TTAGCCTTAT 5940 

CACCTGTGTT AATTGTACCC GCGTTAATGT CTATTGCTTT AGCAGCATTG ACCGTAACAG 6000 

CATTCCAAAC TTCATTAGGT GATAGCTTTA ATTTCAAGGC TGCAATCGCC ATAACAAGTT 6060 

GTAAGTTGTT TGTGACACTA CTACCAGGGT TATAATCAGT TGCTAATGCA ATCGCACCGT 6120 

TATTGTCAAG CATGCCTCTT GCATCTGCAT AATCTTCTTT ACCTAAATAG AACGTCGTTG 6180 

CAGGTAAGAG GACAGCTACA GTATCACTAT TTCGCAACTT TTCTTTTCCT TTATCACTAG 6240 

AAGCTACTAA GTGGTCTGCT GATATTGCTT GTTCATCAAT TGCTAATTCC AGTCCGCCTA 6300 

ACGGATCAAT TTCATCCGCA TGTATTTTCA CTTTAAAACC TGCTTCTTTG GCTTTTTGCA 6360 

TATAATGTTG CGATTGTTCT ATTGTAAATA CACCTGTTTC ACAGAAAATA TCCGCAAAGT 6420 

CTGCATATTG TTTTACTTCC GGAAGTAACG CAATCATTTC TTCTAAAAAT GCCTCATTTG 6480 

AACTTGCCTC TTTAGGTACA GCATGAGGCC CTAGGAAAGT ATGTTTCATG TCTAAATCAT 6540 

ATTTCTCAGC TAAACGATTA GACACTTTCA ATTGCTTCAG TTCATTTTCT CTATCTAATC 6600 

CATAACCACT CTTACTTTCA ACTGCAAGCA CGCCGTGTTT AATCATAGTA AGCAAATCAT 6660 

GCTCTGCTTT TTTAAACAAG TCATCTTCGG ATGTTTCTCT AGTAGCATTA ACGGTAGATA 6720 

ATATGCCACC ACCCATTTCT AATATTTCAA GGTAAGACTT ACCTTGACGT TTTAATGACA 6780 

TCTCATGTTC TCGAGATCCA CCAAATGTTA AATGGGTATG TGCATCTACT AATGCTGGGG 6840 

ACACTACCTT CCCACTAGCA TCAATCGTCT CAGTCGCATC GTAGTCATCT GTATGTGTTC 6900 

CAGCATATAC AATTTTGCCA TCTTTAATGA CAACTGTACC ATTTTTCACA ACATTTAATT 6960 

CATCTAATTC CTTACCCTTC AAAGGTTTAT CTGTTGATCT CGGTAAAATT AATTCTGCTA 7020 

TATGATTAAT TATTAAATCA TTCATTACTT ATCACCTGCT TTATCAATCA TTGGAATATG 7080 
AACACCCATA CCTGGGTCAG TCGTCAATAC ACGTTCCAAT CTTCTTTCAG CACGCTCTGA 7200 

TCCATCTGCT ACAACAACCA TACCCGCATG AAGTGAATAT CCCATGCCAA CACCGCCACC 7260 

GTGATGGAAT GAAATCCATG AACCACCTGC AGCTGTGTTA ATGAGTGCAT TCAATACAGC 7320 

CCAATCACCA ACCGCGTCAC TACCATCTTT CATACTTTCT GTTTCACGGT TAGGACTAGC 7380 

AACTGAACCA GCATCTAAAT GGTCTCGTCC AATAACAATT GGTGCTGAAA TTTCACCGTC 7440 

ACGTACAAGA CGATTTAAAG CTAAGCCCAT TTTCGCTCTT TCTCCATAGC CTAACCAAGC 7500 

AATACGTGAT GGTAGTCCTT GATATGAAAT TTTTTCTTCA GCTAAATCAA GCCATCTTAA 7560 

TAACTTTTCA TTTTCTGGGA AAAGTTTGCG CATTTCTTCA TCCQCACGCT CGATATCTTT 7620 

TGGATCACCA CTCAACGCAG CAAAGCGGAA TGGCCCTTTA CCTTCACAGA ATAATGGTCT 7680 

AATGTAAGCT GGTACAAAGC CTGGGAAGTC AAAAGCATTT TTCACTCCGT TATTGAAGGC 7740 

TACTTGACGA ATATTGTTAC CATAATCAAA TGCTACAGCG CCACGTTTTT GGAATTCAAG 7800 

CATTAATTCA ACATGCTTTG CCATTGAAGC TTGTGACAGT TCAACATATT TTTTCGGATC 7860 

TTTTTCACGC AATACTTTCG CTTCTTCTAC AGAGTATCCT TGTGGCACAT ATCCATTTAG 7920 

CGGATCATGT GCACTTGTTT GGTCAGTAAT AATGTCAATT TTAAATCCTT TTTCTAGAAT 7980 

CGCTTGATGG ATGTCTACAG CATTTCCAAC TAACCCGATT GATAATCCTT CTCCACGTTC 8040 

TTTCGCCTCT TCTGCTAATT TTAATGCTTC ATCTAAATCA GCT6TTTTAA CATCACAGTA 8100 

TTTCGTATCA ATTCGCTTAT CAACACGTGT TTCATCAACA TCCACX3CAAA TTGCTACCCC 8160 

ATGATTCATA GTAATTGCTA ACGGTTGCGC ACCACCCATA CCACCTAAAC CTGCTGTCAG 8220 

TGTAACAGTG CCTGCTAAAT CTCCATTAAA GTGTTGATTA CCTAGCTCGG CT^TGTCTC 8280 

ATAAGTACCT TGCACAATAC CTTGAGAACC AATATATATC CAACTACCGG CTGTCATCTG 8340 

TCCATACATG ATTAAACCTT TTTTATCTAA TTCATTAAAA TGATCCCAGT TTGCCCATTC 8400 

AGGCACTAAT ACTGAATTTG AAATTAATAC ACGTGGCGCT TCTTCATGTG TTTTAAATAC 8460 

AGCAACTGGC TTTCCTGATT GTACTAACAT TGTCTCATCT GATTCTAATT CTCGTAACGT 8520 

TTTCTCTATT GCTTCAAAAG CTTCCCAATT ACGTGCTGCT TTTCCAATAC CACCATAAAC 8580 

AACTAAATCT TCTGGTCTTT CAGCAACTTC TGGGTCTAAA TTGTTGTATA ACATTCTAAG 8640 

TACTGCTTCT TGTTCCCAAC CTTTACACTC AATACTCAAA CCTTTTTTTG CTTGAATTTT 8700 

TCTCATAAAA TTCGCTCCTG TTCTTTTAAG AAGTTAATTC CACTAAATTT AAAACGCTTA 8760 

CATTATTATC TTCAATATTC ATTATAGTAT GTTAAAATAT AGCCAACAAA TATAAATAAA 8820 

CTAATTATCC ATAGCTTGAA TCTATAAATA AAAGGAGCAA AACACATGAA AATTATTCAG 8880 
CATATTAGCC AGCCATCTTT AACTGCTACG ATTAAAAAAA TGGAAGCAGA TTTAGGTTAT 9000 

GACTTATTTA CACGTTCAAC AAAAGACATC AAGATTACCG AAAAAGGAAT ACAGTTTTAT 9060 

CGTTATGCGA GCGAATTAGT TCAACAATAT CGATCCACGA TGGAAAAAAT GTATGATTTA 9120 

AGCGTTACAT CAGAACCAAG GATAAAAATT GGGACTCTTG AATCTACGAA TCAATGGATT 9180 

GCGAATTTAA TTCGAAAGCA CCATTCCGAC TACCCTGAAC AGCAATATCG TTTATATGAA 9240 

ATACATGATA AACATCAATC TATAGAGCAA TTACTGAATT TTAATATTCA TTTAGCTATA 9300 

ACAAATGAAA AAATAACCCA CXSAAGATATA AGATCCATTC CTTTATATGA GGAATCTTAC 9360 

ATTTTATTAG CACCCAAGGA AACATTTAAA AATCAAAATT GGGTAGATGT TG7U\AATTTG 9420 

CCACTCATAT TACCAAACAA AAATTCTCAA GTGCGCAAAC ACTTAGATGA CTATTTTAAT 9480 

AGAAGAAATA TTCGTCCAAA TGTCGTTGTA GAAACAGATC GATTCGAATC AGCAGTTGGA 9540 

TTTGTTCATC TCGGCTTAGG TTACGCTATC ATTCCGAGAT TTTATTACCA ATCATTTCAC 9600 

ACGTCTAATT TAGAATATAA AAAAATTCGT CCAAACTTAG GCCGAAAAAT TTATATCAAT 9660 

TACCATAAAA AACX5CAAACA CTCCGAACAA GTACATACAT TCGTACAACA ATGCCAAGAT 9720 

TATTTATATG GACTTTTAGA GGCTCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9780 

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9840 

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9900 

CTCAGTCAAC TGTATACCTT TTTCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9960 

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGTGCCTCT TATGTAGTTG 10020 

CGTAGTCAaC TGTaTACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 10080 

CGCAGATCAT CGTATAAAAA TTAATGACGT CATTTCAAAA ATCGATACAA AAATAATTTA 10140 

TTATAAAAAT TCTAAGAAAG AAGTGAAGCA GATGTTAAAA TCTATTAATC ATATATGCTT 10200 

TTCAGTCAGA AATTTAAACG ATTCAATACA TTTTTATAGA GATATTTTAC TTGGGAAATT 10260 

GCTATTGACT GGTAAAAAAA CTGCTTATTT TGAGCTTGCA GGCCTATGGA TTGCTTTAAA 10320 

TGAAGAAAAA GATATACCAC GTAATGAAAT TCACTTTTCA TATACACATA TAGCTTTCAC 10380 

TATAGATGAC AGCGAATTTA AATATTGGCA TCAGAGGTTA AAAGATAATA ACGTGAATAT 10440 

TTTAGAAGGA AGAGTTAGAG ATATTAGAGA TAGACAATCA ATTTACTTTA CCGACCCTGA 10500 

TGGTCATAAG CTAGAATTAC ATACTGGCAC ACTTGAGAAC AGATTAAATT ATTATAAAGA 10560 

GGCTAAACCA CATATGACAT TTTACAAATA AGGTGTCATT ATAAAAAGGC CTCTTGAACT 10620 

CCGTTAAAAT TTTAATTAAT TATTATATAA TAAGAGAACT TTTCAAACAA TACAGTTGTT 10680 
TTACTGCAAT TATTTTTCAA ATATATCAAC GTTAATATAA CTTCTATTAA GAAATACTCA 10800 

CATTCTGCCC TGCAATGCAA ATCTCGTCAC ATATAAATAT TTTTAATTAT TTTAAAAAAT 10860 

GATGCACTAA ATTAGCAACG AGCTTAGCAG TTCTATTGTC AGCGTCATAT GTTGGATTCA 10920 

TCTCAGCAAT ACTAACTGAA GACACCTTAT CACTTGGAAT AATACGTTTT GCTAATTCAA 10980 

GAACAGTATG TGGATACAAA CCTAACACTG CCGGCGCACT TACCCCAGGC GCAAACGCAC 11040 

TATCAATGAC ATCCATACAA ATCGTAAACA TAAOXSACATC ATGTTCATGT ACAAAACGTT 11100 

CAATCATATC TTTAATTGTT GGTGATACGT GACTCAATAA TTCATCTGCA AAGACATAAT 11160 

CAATCTTTTT CTCTTTAGCA TAATCAAATA AACTTTGCGT ATTACCACCT TGAGCAATAC 11220 

CAAGCACTAA ATAATCTGTG TTTTCATCTT CTTCTAAAAT TTGTCTAAAG CTCGTTCCAG 11280 

ATGTAGATTG TTGTTCAGCA CGTGTATCAA AATGCGCATC AATATTTATC ACACCAATAG 11340 

ATTGTGTTGG ATAGACTTTA CGTGTTGCTA AATATTGAGC ATACGCAATA TCATGTCCAC 11400 

CACCTAATAA AAATGTTTGT CTATGATTAG CAATTGACTT CGCTGCAAGC ATAGCAAATT 11460 

CTTTTTGAGT ATCAATTAAT TCCTCATGAT CATGATAAAC ATTTCCGTAA TCGACTAAAG 11520 

TTcACATTGA TTCAAATCCG GCAAACCTGC AAATGCTTGT TTAATCGCAT CTGGTCCTTC 11580 

TTTTGCACCA ATGCGCCCCT TGTTTAAAGC AACACCTTTG TCAACAGCAT AGCCTAATAT 11640 

ACCGACCCCT GATGGCATAC TACTCTTTTC CAGCTTAGAC AAATCTTCAA ATGTTACTGT 11700 

TTGAAAATGT CTAAATTTTT TCGGGTCTGT TTCACTATCT AACCTTCCAG TCCATAAATT 11760 

TGGTTCACCT TGCTTGTACA CAGCATTTCC CCCTCTTATT TATGTGGCTT ATTAACAATT 11820 

AAAGTATAAC GTATAGGAAA TTTTGAATTC AATTCATAGT TAAATCCGTA TCTTAAAAAT 11880 

ACTTATCTAC ATTACTTTTA CCCCTATTTT CTATGTAATA ACGAATACTT AGCTGATTTA 11940 

TGTTAATAAA ATACGTCAAG ACTATTACAT TTTCATTAAT ATTGACATAG ACAATTTATC 12000 

TCTCGGCTTG TAATATGTAT AATTGTTACT AAAAGATAT7 TTGCTTGTTA CCTAATGGAG 12060 

GTTACATATA ATGAAGAACA ATAAAATTTC TGGTTTTCAA TGGGCAATGA CGATTTTCGT 12120 

CTTCTTTGTC ATTACAATGG CGTTATCCAT TAT6CTCAGA GATTTCCAGT CTATAATTGG 12180 

TGTCAAACAC TTTATATTTG AAGTTACAGA TCTAGCACCA TTAATTGCTG CAATCATTTG 12240 

TATACTCGTT TTCAAATATA AAAAGGTCCA ACTTGCAGGT TTAAAATTCT CAATCAGCCT 12300 

GAAAGTAATT GAACGTCTAT TGCTAGCTTT AATTTTACCT TTAATTATTC TAATTATTGG 12360 

TATGTACAGC TTTAATACAT TTGCAGATAG CTTTATTTTA TTACAATCAA CAGGCTTATC 12420 

AGTACCTATT ACACACATTC TGATTGGACA TATTCTGATG GCGTTCGTAG TAGAATTCGG 12480 
TGTTGTTGGT TTGATGTATT CAGTTTTCTC AGCAAATACA ACTTATGGTA CAGAATTTGC 12600 

TGCTTATAAC TTCCTTTATA CATTCTCATT CTCTATGATT CTTGGTGAAT TAATTAGAGC 12660 

GACTAAAGGA CGTACAATTT ATATTGCAAC GACATTCCAT GCTTCAATGA CATTCGGACT 12720 

TATTTTCTTG TTTAGCGAAG AAATCGGCGA TCTATTTTCA ATCAAAGTCA TCGCCATTTC 12780 

AACAGCAATC GTTGCAGTAG GATACATTGG TTTAAGCTTA ATTATCCGAG GTATTGCATA 12840 

TTTAACAACA AGACGAAACC TTGAAGAACT TGAGCCTAAT AATTATTTAG ACCATGTCAA 12900 

TGACGATGAA GAAACTAATC ATACTGAGGC TGAAAAATCT TCTTCAAATA TTAAAGATGC 12960 

TGAAAAAACA GGTGTAGCTA CTGCATCAAC GGTTGGTGTT GCTAAAAATG ATACTGAAAA 13020 

TACAGTGGCT GACGAACCAA GCATTCATGA AGGTACT6AA AAAACAGAAC CTCAACATCA 13080 

CATAGGTAAT CAAACTGAAT CTAATCATGA TGAAGATCAt GACATCACTT CGGAGTCAGT 13140 

AGAATCAGCm GaATCAGTTA AACAAGCACC ACmAAGTGAC gATTTaACAA ACGATTCAAA 13200 

TGAAGATGAA ATAGAGCAAT CATTAnAAGA ACCTGCGACT TATAAAGAAG ACAGACGTnC 13260 

ATCAGTTGTA ATTGATGCAG AAAAACATAT CGAAAAAGCT GAAGAnCAAT CTTCAGATAA 13320 

A 13321 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8549 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

ATGTCTTCTA AACTTTTATG TTGAAAAAGC TACTTATCTC AATGAAAACA AGTAGCATTT 60 

AATAAATTAA TTAGTATACA GCTAGTTTTT CTAATTGTTC TTTAACTTGA ATTAAGTTTG 120 

ACCGTATTAG AGAGGCAGAT TGATCCATCG TTTGAATTGC TTGTCCTTCA TTTTCGTTCA 180 

AGCCATTACA AACAACTTCA AACTGTTGTG CCATTTGATC AAGACGCGCA TGAGCTTGTG 240 

TGTTTAAAAT AAACATATCG TCATAATGTG ATGGCGAATA GATAATTCGT CGTTGTATAC 300 

AAACGTATAA AAACCTTGTC ATATCAACGG TTTTGGCATT TTTAAACCTC TGTGTTTTCC 360 

ACGCATGTTT GCCCTTATTT AAATAATTTG CCCTTTTTTC GCCCCGAAAA AAAAACACAA 420 

AAAAATAACC ACACTCCTAA ATTAATAGGT GGTGTGGTTT TGTTGATTGT AGGGGTATAA 480 

AAATAACCGC ATTATTAAAG ATACGGTTAC TCTGTTATCT GTAAATATAA TAGTAGTTTA 540 

AAACAGGACT CCACATAAAA ATCAACTCCT TTATATACCA TAATGATACT ATATTTTCTA 660 

GTTTATTTCA ATTTTTCAGT TTTTAAAAAT GAGTTTCTGT TTTTATTTAT ACGCTTTTCT 720 

GTTTTCTTTT TAAATTTTAT CTTTTTGTTA TTCCATTCAT TGTAAAATTC TATTAAATTA 780 

ACATAAAATT TTTCATGCCC TATTTTATTT GTTGATGAGA TATCAATGTA AAGACTCAAT 840 

ATTGTTTTTA AATAGATTTG ATGCAACGAC TGATAAACCG TATTACTATC TGCTATGTTA 900 

TTGGTAAAAT GCATAGAAAA ATATTCTAAT TTATTCATGC AATATATATG GGTTTCATTA 960 

TACTTCTTAA TGAGTGTATT TATACCTTGC AATACGTCAT TACTTTTAAT AACAATTTCT 1020 

TTTTCACCTG TCGAAAAAGT CCACTGTTTA TCTCCTATAT TTTCTTTAAT TGTTTTCTTG 1080 

TTGTCAAATT CTAAAATTAT AGCCCGTAAA CACTCTTCTT TATAATTCTC GTTCTTGAAA 1140 

GTACGAAGCA AAATTTTTAT AAATTCGGTA TTGGTGACTT TTTTATAAGT GTGATATTTT 1200 

GCAATCTCTT TATCAGTAAA GACTGTTCTT AGTTCGTGAT TATCAAAACT TAAATTCATC 1260 

TTATTCTCTA ATTCATTAAT TTTATCTTGC AAACCAACAT TTTCTAAAAT TTTCTTGTTT 1320 

ATCTCCCCTA TATCAAAACT CCTTTTCGAA ATTAATTTTG AAAACTCGTC TGCCATTTCA 1380 

ACAGCCTTTT CTTTCCTTTT ATACCTTTTG TTAAATTTAT GAACCACCGT TGCAGCATAA 1440 

TACGATATCC CACCAGATAA AATAGATGaT ATTATCGGTA TGTATATATC ACCTTTCATA 1500 

TTTCCACCTC TTTTAACACA ATTAAGTATT ATGATACACA ACTTGCGCAA AAAGATGTAG 1560 

ACAGAACATA ATGGCGAACA AAAACAACCA CCCAGTAACT AGTATGGGTG GCGTAgACTA 1620 

TAACAACTCT ATGTTATCAA GATATATGTA TCGAGTGATG GCAAGGAAGA AGTCTCCTGC 1680 

GGGACCAACA GTCAGATATA TGGCCTCTGC CGGGCTATAT AGTTCACTCC TACTATATAA 1740 

AAGTAAGTAT AACATAAAAA GCACCCCGTA AACTGTTATA CGGGAATGCT AAAGTCATAT 1800 

ATACTACGGG GAGTAGTATG AAAACTATGC TCTCTATCGT AAGAAAAAAC ACCCAGTGAC 1860 

ATGCTTGGGT GAACAAGGAT AGATGTAAAT AGTTGATGCA TGTGTAcACA TCATAACAAA 1920 

AAACTAGCCC GAAGcTAGCT ATAACATAAA AAAATAGGCA AGTACCGAAG TACCTGCCAG 1980 

TTACGCACAT TTAAATCTTG AGAGTAATGT TAAAAAGTGT ATAGGAATAT TAACATCCAT 2040 

CCAAATAGTT ATTTAATAAC TGTAAGATTC CCTATAATTA ATGTAGCaAA ATTTTTATTC 2100 

TAAGTAAATA CTAAATCQTG CTAAACTTAC CAAAACTACT TATTCTATTA CCTGCCTTGT 2160 

CTACCTCTCC TGTCGCTATA TAACGACGTT GTCCACTATT AGCAATATAA GTAATCCATC 2220 

TATAGCCATT GATGCAATAT GCGCCGTCAT ATTTAATTGT TGCGTTATTA GGTAATACAC 2280 

CTGTAATTCT TGAATTAGTT GAATAGCCGT CCCTTACGTT ATTACCTTTA ACATTGGCAA 2340 
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CTGGCACTGG TGGATTTTTT TGGTTTTTAG CTGATGTTTT AACATTACCA GCTACCAAAC 2460 

CACCTATAGG CTTACCATGA ATCGCACCGG CTATTAATTT AGAATACAAG TCATAGTTTT 2520 

TCTTAATCCA ATCCATATCA TTTTTATTAG TAATAAAACC TAATTCAGAT AAACGATAGT 2580 

TTATATTTAT TTCTGCTGAT ACATTAACGT TTAGTAAATC ATTACGAGGT GTTACACCTC 2640 

TTATTTGTCC TAAGTTATTT TTAATAACAT CTTGTATACT TTTATCAATA GTATCTGCAT 2700 

TGAATTGACT TGAAATAATA ACATGCCCAC CACTTGCACT TTCTCCTGCT GCGTCTAAAT 2760 

GAATCTCTAG AACAATGTCA TACCCATGTG ATTTAACCCA ATATAAGCCA TAATCTTTAT 2820 

TATTTCCTAC ATTAACACCG TAAGCAGTAT CTTGATACAT ATCTTGTGAT TGACTTGAGC 2880

CACCATATAA TGCAACTTCG TGACCTGCAT GTCTTAAATA CTTAGCGATA TTTGGTGTTA 2940 

TATATTTACG GATAAAATCA CGTTCATTTG TTCCGTTTCC GACTGCTCCA GGATCGTTAT 3000 

AACCATGACC GGCTACAAGC ATAATTTTTT TAGGTTTAAT TACTOCTTGC TTTTTGGCAG 3060 

TTGCTTGCTT AATAACGCTT TTAGCTTTAT CTCCAACACT TACTTTATCT GGGAAATTTA 3120 

ATCTAATAAA ATACATTGGG TCATCGTAAT AATGAACATG TCTTGTAACG GTTTCGGGAC 3180 

CCCAACCAGG TTGCGCAACG CCATTTGTCC AACCTTTACC ATTCCAATTT TGGCCAAACG 3240 

ATGTGAAAGT GTTTAGATTA GCGCTCTCAA CAATTTCAAC ATGTCCaGct CCGCCACCAT 3300 

ACTTTGACGG GAAAACGACA ATGTCCAACT TTTGCXK5TAA AAAGCTATCA TAGTTTTTAA 3360 

TTATTTGCCC GTATTTTTCA ATCCTTGCTT TATTATCAAA TGGAATATTA TAAGCGTATA 3420 

AACCTTGTAA CcTTTCGCCT GTTGCTATCA TAAAAAACAT ATTTGCGTAA TCGTAACACT 3480 

GAAATCCATA AAACAAATCA GGATTGAACT GCTTCCCTAA TGAATTATCA AACCATTTTT 3540 



CTGCTTGGTT TTTTGTTATC AACATTGGTC AACACCTACC CTAAATCATT TGTGTCGTTC 3600

ATA-M^CGTAG GTGTCATTAC TTCTTTAATT GGCGCTTGCC CTGTTGCTTT TCTATACTTG 3660

TTTTCAGCTT TATATTTCTT TAGCTTTTGA TTTGCCCATT TACCTTCTTG AGATGTTGGA 3720

TTATCTTTAT ATGTAGTATA TAAAGCAACA ACTGTTAAGA TAATCGATGA AACACTTTCT 3780

TCATCTACTG GTATCGGACT TATACCTTTA TTCGCTAAAA ACTGATTGAC TAATGCTAAG 3840

ATCAATACGA TGTATCTTGT TATTACTTTT GCATCCATTT GTTTGCTCCT TTTATCCAAA 3900

ATAAAAAGCC AGTGCCGAAG CACTGACTCT TAACTATTAC TTACACTTAC TAAACCAGAA 3960

ACACGACCAA AAGCTATATC CTAAAATTCC CTTAAGCATG GTAATCACCT CCTTTAAATG 4020

CCAAAAATAG TTTTTAACAA GGCTATAACA AATGTACTTA GAATCGTCCC TATTAATCCT 4080

AGAATCCACA TCTTGATGTC TCTAATATTT TTAGCATTTT TCTCTTTATT TTTTTCATCT 4140
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TGCGTTCTCA GACTGTCTTC TATTCTGTCG AATTTTTCAA ACATAGTCTT ATCATTTTCT 4260 

TCTAATCGCG TTAAACGCCA ATCTTGTTC6 TGTCGTTTGG TAAATCCAAA CATTACACCA 4320 

CCCACTTTAT TCAAATTAAA AAGCCATAAG ATTATAACCT ATGACTCTAG ATTTTCTGGA 4380 

TACTTTTCTC CTGTAATAAT TGCATATTCC TCTTTATCTA TAACTTCCAT ATCTACATAC 4440 

CACGCTATAT CTTCTTTACT ATATTCTTTC AATTGATACC ATGTTTTAAT ATCTTCGAAT 4500 

GTTGGTGAAA TTAATTTAAG CATTTTCAGT CTCTCCTTTA ACCTCTTCTA ATTTTTTATT 4560 

AAGTGTCACA AGTTGTTTTG CCATTAGTGC ATTTTGCTTA TTAACTTGCA TCGATAACTT 4620 

TGTACTTTGA ACAACTTCTT TCTGCATACT AGCAACCATT TTTCGTAAGA TGTCATCAGA 4680

AGCGACTGTG TTTTGTTCTT CACTGTCAAT CTGTTGATGC AAGTCATCTT TTTCTTCTGA 4740 

ATAATCTTCG TTAAAAACTA TTTCCCCATT TGAATATTTA AAGGCTTTAG GTCTAAAAAC 4800 

TTGAGAGAAA TTTTCTGGTA AATTTTCTIAT ATC7UVTACCT TCTTCAAAGC CACCAATGAT 4860 

AGCGTATGAA ATTATCTCAT TACGCTTGTT AACTAATATT TGCATTATTT TCTCACTCCT 4920 

ATAATTTTGT TAATTGTCCC TCTATTTGCX3 TTCGCACCAG AGCCTCTTTG ACTTCCTAAG 4980 

TCGAAATAGA CATCGTTTGA TATAGTTAAA GATGTACGAC TAGATTTAGT TAATCCAAAC 5040 

TCATAAACAC CTCCACCATT TCCATCACCA TCTGGAAGAT TTGAGGGATT CAATGAAATC 5100 

TTTCCTCCTC CAAAAGGACT GCCAAACTCT GTAAAGTCAC CACCTGGAAA AGTCCCATAA 5160

AAAATTAATA AAATAAATTG GTCTAAACTC TCATTTAAGT ACAATGTAGA GCCCACACCA 5220 

TTTGCTGTTC CATCAAAAAT AACCGAATAC CTTTTATTAA ACTTGTCATC TGCGTATAAT 5280 

TTAGCGTTAC TTTCGGCCAT ATTAGCTTTT GATTGGGCAC TTTGAACAGT TTCAAAAGGT 5340 

GTATTGTAAT CATTAATAGC TAATTCTGAC CACTCAGACC ATGAACCCGC TTCTTTTCTT 5400 

TTAACAAATA CTTTATTTGT ACCGTTCGGT CGATAAGTCA TACGCTTGTA ATCTGAAGTT 5460 

ACTACTAAAT ATTCGACAGT ACCGTTAGTA CTAACACCTC TTGGATAATT TATAGCTTGC 5520 

GAAACATAAA TAAATTGGGT TGAATCACCT ATTCTTTGTT CTGGATTATT AAAATCAAAT 5580 

CCAGTAATCT GCATTATCTT ACCATCATCT TTAGTAATCT TAGCTTTTTG CCAATTTGAA 5640

GTAGAACCAC TTGTGACTAA ACCACCACTA TTCACTGACT GCTTGAAGGC TTCATGTTTC 5700 

TCATCCATAT ATCGCTTTTG CTCATCGAAT GTTCTTGAAT ATGCTTGCGC TTTATTTTCC 5760 

AAATCAGATA TATGGCTATT AGCAAGTTGC TTTAATTCAT CTATACTTGA AGATTTTGCT 5820

ATTTGAATAT CTGATAGACC TTTTTCTTTA GCTTTTTCAA TCAGACTCGC ATAATCTTCA 5880 

CCATTTTTTA TAGCCTCGTC CATTGCTTTC 6CACGATCCA TAATAGTTTT TTCTAATTCC 5940 
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



TCAACGTTAA ATGTGATAGT TCTCTCGACA ACTACCACGT CTGAATTACC TAATTCTGCA 6060 

ACCGAAACTT GAGCTTGATA ACTTCCATCT CGTTTAATTA CATCATTAGG TAATTGAAAT 6120 

TTTAAAATAC CTTTAAATGG ATCTAATATT TCTAGTGGAG CAACTACCAT QACTCCTTTA 6180 

CCTCGAATCG CTATTCGTGC kTTGATATTT tCTTCACTCA ATAATAACGG TTGATTATTT 6240 

TTAGTGATAT TAAAAAGAAG AACAGAAGAA TCACTCTCTC CTGTTCTAAA AGTTATATCT 6300 

AGATTTGAAA TATTTCCATA ATGCGCTGTG TTTTCTAAAT TTATAGCTAC AGATTTCTCT 6360 

AAATTACTCA TTAACTTATA ATTCTCCCTT CGTGTAAAGT CCATGGCCCT GAACTTGTTT 6420 

TACTATCATA ATTTTTCAAT AGTATCTCAO CAGATQCTQT AACACTATTA CGAACTAGCC 6480

TATGAACAAA GCCACCTGTG TTTGAAGCTT CTACATATAA GTTCCAACCA GCTACCCCTT 6540 

TACX5TTCAGT TGGAAAATCT 6TAAAAC6TT TTGTATCATC CGTAGTTAAA TAAAACGACA 6600 

TGCCTACTAT GTTAATATCT GACATTTTTG TGATGAATGA AGGTACTCTC TCCCATTTAC 6660 

CACTATTTTT AGGCACATAA TTCCAGTCCG AAATGTCTCC AGTTCTTCCA GAAAGCACCC 6720 

TTTCAAAAGT CATCATATTC CTTGCATAAC TATTACGCGT CAATATCTGA ATTACATCAC 6780 

CGCCAGTTTG TGGTGGCTTA ACTTCCAAGA ACCAACCTGC ATCACGCCAT TCTCTTGGTA 6840 

ATGGGAAATC ATCGATTTGA ACTGTATGAT CAGTGTATAA ATAGTAAAGA CCTGGCTCTG 6900 

TTAACATCCC AAGATTCTTA AGTTTATCAG GCCTCATTGG TAAAGGTTTA ACTCTACCAC 6960

CTGTGTCACT CaTGATAAAA GGAACX3CCTC TTGAGTGAAG TATTTCTAAA ATACCTCTTT 7020 

GCCCAATCAT GAAAATACGA TGTGTTCTAT TTCCaXCACC ACCGACAGTA ACACCTAGCA 7080 

TCAAAGCTTT TTTACCACTA TCTTTGTCAT AGTATATTTG CAAACCTTtC TgCTTCCGCA 7140 

AATTCGCCAG GAAATGAATC tAgTGTTCCA CCATAGTCAG CATTAACCTG ATACGCTTCT 7200 

TCTCeTGTTT CTAAATCGAA AGCCGTTAAA TAGTTTCTAT TATTTGGATT ACTGTCTCCT 7260 

GTATACCAAT ACAAGTATTT TTCATCAAAA GTCACACCCT GCATTGGTTG GGTTTCGTTT 7320 

GTTAGTCTCA TAGGGATACT GATTTTATGC AAAACTTTAT CAATATTTTT ATCAACATCG 7380 

TCTAAACTTC TTATCTCTAT ATAAnTCATT GAGTTTTCAA GTTCCCACTG ACTTCTAGGT 7440 

CTCTCaATTC TGTATAGAAT TTTATTTTCT TTTTCATTTA TGACAGGGGT GATGTAGGGT 7500 

TTTTCTGGGT GTCCTGTAAA TACATCTTGC ATACCATACT TGCCATAGCT AATTTCCACA 7560 

TTAGGCGTAT ACTTGAAACG AACTT^TGTA TTCTCATTAT TACCATTTAA GATAAAACTA 7620

TAAATCCATA ACTCATcATC AATATATCTA TAACCGTTAT GTGTACCATG ACCCCCACCT 7680 

ACAATCAATG AGCTGTCTAT AAATTGACCA TTAGGTCTTA GACGACTTAG CATATAGCCA 7740 
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ATTACTGCAT TTGTAAgAGG TGCAAGTTCT GTCACAAATA AAAATTCTTG CTTATCAGGT 7860 

TCAAAACGAT ACTCGATATC AAGAATTTCT TGTTTGGTCT TATTTAATTC TCTTATAGTT 7920 

TCCTCTTTAT TAATTTGAGT TTTGGTTTCC CAATCGTCTA AATGTTCTTT TAATQTGTCA 7980 

AAGGTTTCGC CGTTTACATT AACTCGAGCT TGAACAATCT CATTAGCACT GTTATTACGT 8040 

GGTGCCACAA CAAGTGCGTT AATTTGACTT TGTAAAGATT TGTTTACTGC TGCTTGCGAT 8100 

CTACCATTAT AATAAATTTG CTCAGCGAAG TGTTGAATTG TTTTAGCTyT CTGATGCAAC 8160 

TTAAACTCTG TTGTCAAGCC AAGCGCAAAT TGCTCTATTC TTTGTAAGTT TTGTATTTCC 8220 

TTAGCTCTAT AATCTC6ACC TGCTAAAGCT CCCAAATCCT TTATTAAATA CAAATTTTCC 8280

ATAATGCACC TTCCTTTCTA ATAAAATAGC ACTGTACCAA GTTTCCCACT ATCGTCAACT 8340 

GTTATTTTCC ACAATTTACC GTTTGGGGAT TTCTGTACAA TGCTATTTTG AATAATTgcC 8400 

TGCtTCGCCT ATTTTTAAAT TATCTAATTT ATTTkTATCA TTTACCGAAA TGATACCGTC 8460

TTGAGGCAAT CCATCAATAn CACTACTGCC TGCATAAGGT ATCCCATTTA TAGCTTTCCA 8520 

ATGTGTAGCT GGAAAGTACT GTTTATCGT 8549 

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3601 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

AGGCGTGTAG TGACTTACGG nTAGGAAACT ATGTATCCGA ATGATTTATT GAGACCAAAA 60 

AGGCATTAAA GTCCATTGAA ATATCnGGTA GCGtnGTTGGT ACgTGGACGT GGGGGCCCTA 120 

GATGTATGAG TCAACCATTA TTCAGAGAGG ACATTTAACG TAATAAATTA TAGAmACGAG 180 

GGTGAAAATA ATGACAGAAA TTCAAAAACC GTATGATTTA AAAGGCAGAT CATTATTAAA 240 

AGAAAGTGAT TTTACCAAAG CAGAATTCGA AGGACTTATT 6ATTTTGCAA TTACATTAAA 300 

AGAGTATAAG AAAAACGGTA TTAAGCATCA CTACTTATCT GGAAAAAATA TTGCACTACT 360 

ATTCGAAAAG AATTCGACGA GAACGCGTGC TGCGTTTACA GTTGCGTCTA TTGATTTAGG 420 

TGCGCATCCA GAATTTTTAG GAAAAAATGA TATTCAATTA GGCAAAAAAG AATCTGTAGA 480 

GGATACTGCG AAAGTATTAG GTAGAATGTT CGATGGTATT GAATTCCGTG GTTTTTCACA 540 

ACAAGCTGTT GAAGATTTAG CGAAGTTCTC TGGTGTACCG GTGTGGAATG GATTAACAGA 600 
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TCTAGAAGGA ATAAACTTAA CTTACGTTGG AGATGGACGT AATAATATTG CGCATTCATT 720 

AATGGTAGCA GGTGCTATGT TAGGTGTTAA TGTAAGAATT TGTACACCTA AATCATTAAA 780 

TCCAAAAGAG GCATATGTTG ATATTGcAAA rGAAAAaGCG AGTCAaTATG GTGGTyCAGT 840

CATGATTACG GATAATATTG CAGArcCAGT TGAAAaTwCm GATGCTATAT ATmCAGATGT 900 

TTGGGTATCG ATGGGTGAAG AAAGTGAATT TGAACAcGTA TTAATTTATT AAAAGACTAT 960 

CAAGTGAATC AACAGATGTT TGATTTAACA GGTAAAGATT CAACGATATT CTTACATTGT 1020 

TTACCAGCAT TCCATGATAC AAATACACTT TATGGACAAG AAATTTATGA AAAATATGGA 1080 

TTAGCTGAAA TGGAAGTTAC AGACCAAATC TTTAGAAGTG AACATTCAAA AGTGTTTGAT 1140

CAAGCTGAAA ATAGAATGCA TACAATTAAG GCAGTAATGG CAGCAACATT GGGGAGTTAA 1200 

TCACTAAATG GAACGATATG AATATGATGT GTCTGATGAT ATAAGTGTCA TGTACAGACA 1260 

CCTCATATTG GTATT7VAAGG AGAAATGAAT ATGAACGAAT CAGGAGATAA CAAACTCAGT 1320 

AAATCTTCTT TAATTGGACT AGTTATAGGA TCCATGATTG GTGGCGGTGC GTTCAATATA 1380 

ATGTCTGATA TGGGCGGTAA AGCCGGTGGA TTAGCCATTA TTATTGGTTG GATTATTACA 1440 

GCTATAGGAA TGATTTCATT AGCGTTCGTA TTTCAAAATT TAACCAATGA ACGGCCGGAG 1500 

CTAGACGGTG GTATTTATAG TTATGmTCAA GCAGGATTTG GCGATTTTGT AGGATTTATC 1560 

AGTGmTTGGG GATATTGGTT CTCAGCGTTT TTAGGCAATG TTGCCTATGC AACACTATTG 1620

ATGTCAGCAG TAGGTAACTT TTTCCCGATT TTTAAAGGAG GCAACACATT ACCAAGTGTT 1680 

ATTGTCGCCT CGTTACTACT CTGGGGTGTC CATTTCTTGA TTTTAAAAGG CGTTGAAACA 1740 

GCAGCATTTA TCAATAGTAT TGTTACTGTT GCAAAGTTAA TACCGATTTT ACTTGTAATC 1800

ATATGCATGA TAATTGCATT CAATTTTGAC ACTTTTAAAA CAGGCTTTTT CAGTATGACG 1860 

TCAGAGGGTG TATTGCCATT TAGTTGGGCG AGCACAATGA GCCaaGTtAA AAGTACGrTG 1920 

CTAGTGACAG TTTGGGTGTT TATCGGTATC GAAGGTGCAG TAATTTTTTC TAGTAGAGCT 1980 

nAAAATGAGA AAGATGTAGG TAGTGCCACX3 GTTATAGGAC TTATATCAGT TTTAATTATC 2040 

TATyTCTTAT TAACTGTATT AGCTCAAGGC GTGATTTTGC AAAATCATAT TTCGCAATTA 2100 

GATTCGCCAA GTATGGCACA GGTGCTTGCA ACTATTGTAG GTGGTTGGGG ATCTACACTT 2160 

GTAAATATTG GTTTAATTAT TTCGGTACTA GGTGCATGGT TAGGATGGAC ACTGCTTGCT 2220 

GGTGAATTAC CTTTCATTGT TGCAAAAGAT GGATTATTTC CAAAATGGTT TGCTAAAGAA 2280

AATAAAAATG GAGCACCTGT AAATGCACTG CTTATTACCA ATATATTAGT ACAATTATTT 2340 

TTAATAAGTA TGCTATTTAC ACAGAGTGCG TATCAATTTG CATTTTCACT AGCATCAAGT 2400 
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CGACAGCAAG CAACTACTAA ACAATGGACG ATTGGTATCA TAGCCTCAAT TTATGCTATA 2520

TGGCTTATAT ATGCAGCAGG TATCAATTAC TTATTATTGA CGATGTTACT TTATATTCCA 2580

GCTCTTCTTG TTTATACaAT CGkTCmAAAG rATwATCAGa CACGTTTGAT TAAATCAGrC 2640

TATATTCtTT TTATGATTAT tATCGTACTT GCAGTTATCG GGTTAATTAA GTTATTGATG 2700

GGAACGATAA ATGTTTTTTA AAAGGAGCGA CAAAAATATG AAAGAGAAAA TTGTCATTGC 2760

ATTAGGCGGT AATGCGATAC AGACAACAGA AGCAACAGCT GAAGCACAAC AAACAGCTAT 2820

TAGATGTGCG ATGCAAAACC TTAAACCTTT ATTTGATTCA CCAGCGCGTA TTGTCATTTC 2880

ACATGGTAAT GGTCCACAAA TTGGAAGTTT ATTAATCCAA CAAGCTAAAT CGAACAGTGA 2940

CACAACGCCG GCAATGCCAT TGGATACTTG TGGTGCAATG TCACAGGGTA T6ATAGGCTA 3000

TTGGTTGGAA ACTGAAATCA ATCGCATTTT AACTGAAATG AATAGTGATA GAACTGTAGG 3060

CACAATCGTT ACACGTGTGG AAGTAGATAA AGATGATCCA CGATTTGATa ACCCAACTT^ 3120

AcCAaTTGGT CCTTrTTATA CGAAAGAAGA AGTTGAAGAA TTACAAAAAG AACAGCCAGA 3180

CTCAGTCTTT aAAGAAGATG CAGGACGTGG TTATAGAAAA GTAGTTGcGT CACCACTACC 3240

TC6LATCTATA CTAGAACACC AGTTAATTCG AACTTTAGCA GACGGTAAAA ATATTGTCAT 3300

TGCATGCGGT GGTGGCGGTA TTCCAGTTAT AAAAAAAGAA AATACCTATG AAGGTGTTGA 3360

AGCGGTTATA GATAAAGATT TTGCTAGTGA GAAATTAGCA ACGCTGATTG AAGCAGATAC 3420

CTTAATGATT CTTACGAATG TAGAAAATGT ATTTATTAAC TTTAATGAAC CTAATCAACA 3480

ACAAATCGAT GATATTGATG TAGCAACACT GAAAAAAtAC GCGGCACAAG GTAAGTTTGT 3540

GGAAGGATCG tGTTGCCAAA AATAGAAGCT GCGtACgtTT GTTGAaAGtG GGGaAACCAA 3600

A 3601
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

CGACACTATT AAATGAATTA GAGCACAATC TAACAAATCA AATTCATTTT TCAAAAGATG 60 
AACGACTCAC ACATATCGCT TTAAAGTTAT TCGAAACAAC CGATCCTGTT TCAACAAAGC 120 
AACTTGCGCA AGATGTTAAT GTTTCGCGTC GGACAATTGC AGATGATATT AAAATGATTC 180 
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TTATTGGTGA GGAAGATCAT TATCGTAAAG CGTATGCACA CTTTATACAT CAATATATGA 300 

AACAAGCTGC ACCTTTTATA GAGGCGGATA TCTTTAATTC AGAATCAATC GCATTGGTTC 360 

GCCGTGCCAT TATTAAGACA TTAAATAGTG AAAATTATCA TTTAGTTCAG TCGGCTATCG 420

ATGGCTTAAT CTATCATATA CTCATTGCCA TTCAGCGTTT AAATGAAAAT TTTTCGTTCG 480 

ATATACCTAT CAATGAAATT GATAAATGGC GACATACTAA TCAGTATGCn ATTGCTTCAA 540 

AAATGATAGA AAACTTAGAA CGCAGTGTAA TGT 573 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1221 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

TTGATATTTA TAACGTTATA TTTTAATAGT TCACCTGGAT TATTAAATAA ATAGTCCGCC 60 

AAATTTTCTT TTTCTTTATC AATCTGaTkG TAATTAACaC TTTCGaCTTC TGTAGGAATT 120 

CTAATGTCAA CAGAAGCATT GATATAAGCT TGATGTTGCA TGCAATCACA CTCCTAATCC 180 

TTCATtnTmAA ACGGAGAAGT AAACCCGTCA CTATTCAAAT TCAATCCTTT TGCCCAATCA 240 

ACAGGCTTAT TCATGATAGT TTCGATTTCC TTAAGTCCAT TTGAACCTCT AGGTATTTCT 300 

ACAATTACTT CATCATGGAC ATGGCCAACT ATTTTAAAAC CTAATGCTTC AAGCCTTGCT 360 

ATAGAAATCG CAAGTAAATC CCTTGCAGTT GCTTGAACAA TATTCTCGAC TAACTTCCCA 420

CCATACGTTT TTAACTTTGA CCATTTACGG TTAAGATCTA ACCCCATAAA TTCAACAACT 480 

TGACSTACCCC AACTATTTTC ACCAACTAAA GCTTTTGGAT AAGCTAAAGC TCTTCCACTA 540 

GGCAGTTCAA TCATTAGAAA ACCTTTTTTC ATATAAAATC TAAGTCCATG TGTATGATGC 600 

GTCTTTCGGG ATTTTACAGT ATTAATTGCA GCCTCTTGGC AAGCCTTCCA AAAATTAACT 660 

ATGTTAGGAT TTGCGTTACG CCAACTATCA ACTAAACCTT GTAACTCGTT TTCTTCAATG 720 

CCCATTTCCA ATGCACCCAT TGCTTTTAAA GCTCCAGCGC CACCTTGATA GCCTAAAGCT 780 

AATTCGGACA CTTTTCCTTT TTGTCTGAGA GGGTCGCCTT TAGTTATGCT TTCTACCGGT 840 

ACATTAAACA TTTGAGAAGC CGATGCTTCA TATATCTTTC CGTGTGTGTT GAATACATCT 900

AAACGCCATT GTTCTTTTGC ATACCATGCT ATGACTCTTG CCTCTATTGC AGAAAAATCA 960 

CTTACTGCTA GTTCATTACC TTCTTCAGCA GTAAATGTCG TCCTAACTAA TTGACTTAAT 1020 
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AGATCTCTTG CTATTTCTAA TTCAGTATCT GAAATATAAT GCTTTGITAA ATTCTGAAGT 1140 
TGTACACCTC TACCTGCCCA TCTTCCAGTA CCGGCACCGT AAAATTGAAA CAGACCTCTT 1200 

ACCCGTTCAT CACTGCACAT C 1221 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1090 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
TTTTGTTTGG TATGAGGTAG CAATGACGAC GTGTCATTGG TGGAGATTGT AAAAATACAT 60 

AATAAAAAGA AGCGGCAATG TATACCGCTC CTTTTTTATA CTACATACCG ATTTTCAACC 120 
ATCTCTTTCT ACTTAGTAAT AAGACAATAG TATTAACTAT AAATAGAAGA ACGAAGAATG 180
ATACTATATT TATAATTTCA GTAGGACACA TAAATGTTGA CTCGTTATTC AATATTTTTT 240 

CTACGGCACG ATACATCGTA TTGCTCGCCT CAAATGGAGC AACGATACCA AATATATTTT 300 
TATTAATGGC AACTAAGATG ACTGAACCAA TCCAATATAC AATGCTGATA CCTAAGCTGA 360 

TTAAAATGTT AGGTGAAACC ATACTAATCG TTCCAACAAC TAAGATATAT TGTAAGATAA 420
CGAGTGAAAA TAAGATTATT AATAGTAAGT AATGTGAGAA ATCCGAATAT ATAATTGAAA 480 
TAATAGTGAT ACTTAGAATT ATGAACACTA AACATTCAAA AAATAACACT GCTACCTTTT 540 

TATAGAAGAA GGTAAAGATA TTATCGCCAA TCAATTTATA AAACAGGATA TTTTTATTCG 600
AATACTCTTT ATTAATAAAA TATGCAATAA CAAATGAAAA TAGTAAGAAC CCTAATTGCG 660 
TTGCAACAGT ATATGAACTG AAGAAAAACT GGCTATAGCT TAAACTTTTA ACTTTGTCTA 720 

TACCTATTGG TAAAAAATAC CCAAGTAAGA AAAGGAATGT GAATAGCACA ACAAGCGTGT 780
AAATAATTTT ATTGGAAATA CTTTTTTTAA ATTCTAATTT CAAAGTGGAC ACCTCAATTA 840 
TAAATTAATG TAATCATTTA TGACTTCTTC TTTTGATTGG TACTCTTCTA TTTGAAGGTC 900 

TTTAAAAATA AAGTATTTAC CCGGCAAAGC ACTTAAATCG GATAAATTaT GTGTAATATT 960 
GATAATAGTT TTAGTTTGAT GGCTTTGAAT AAAATCATTT AAAAATTCAT AAATTTCATT 1020 
AACTGTTTTC TTGTCTAAAG CGTTTGTAAC TTCATCTAAT ATGATTAAAT CATGATCTTC 1080
CAATAAGAAA 1090 
(2) INFORMATION FOR SEQ ID NO: 10: 
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



(A) LENGTH: 904 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

TTAGGACTAT TTTATCATAT TCATTTAAAT TACGGCTAAA AATTTTAAAA ACGGGGATTA 60

ATATATGGAA TTAAGCTATG AAAGTTAATT GATACTTGCA TTTTACGCTG ATTTATATAA 120 

GAATAACTAT TGTATAGTTT TAAAAACGAA CGTACGTTTG CAGGAGGCGA AATCATTGGC 180 

AATGAATAAA CAAAATAATT ATTCAGATGA TTCAATACAG GTTTTAGAGG GGTTAGAAGC 240 

AGTTCGTAAA AGACCTGGTA TGTATATTGG ATCAACTGAT AAACGGGGAT TACATCATCT 300 

AGTATATGAA ATTGTCGATA ACTCCGTCGA TGAAGTATTG AATGGTTACG GTAACGAAAT 360 

AGATGTAACA ATTAATAAAG ATGGTAGTAT TTCTATAGAA GATAATGGAC GTGGTATGCC 420 

AACAGGTATA CATAAATCAG GTAAACCGAC AGTCGAAGTT ATCTTTACTG TTTTACATGC 480 

AGGAGGTAAA TTTGGACAAG GCGGCTATAA AACTTCAGGT GGTCTTCACG GTGTTGGTGC 540

TTCAGTTGTA AATGCATTGA GTGAATGGCT TGAAGTTGAA ATCCATCGAG ATGGTAATAT 600 

ATATCATCAA AGTTTTAAAA ACGGTGGTTC GCCATCTTCT GGTTTAGTGA AAAAAGGTAA 660 

AACTAAGAAA ACAGGTACCA AAGTAACATT TAAACCTGAT GACACAATTT TTAAAGCATC 720 

TACATCATTT AATTTTGATG TTTTAAGTGA ACGACTACAA GAGTCTGCGT TCTTATTGAA 780 

AAATTTAAAA ATAACGCTTA ATGATTTACG CnwGGgTAAA GAGCGTCAAG AGCATTACCA 840 

TTATGAAGAA GGGAtCaAAG rGTTgTTAGT atGTCCAaTG ArGGAAAAGA AGTTTTGCCT 900 
GACG 904
(2) INFORMATION FOR SEQ ID NO: 11: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11271 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GATTTCTAAA TCAAGATCTG TTTTACGATA ACCATTCAAA CCTTGACGTT CATCTTCTTC 60 

AGGTTGATTT TGTTGCTGTG TGTCTTTGTT GTCAGAAGTC GCTACTGTTT TTTTATTATC 120 

TGTTTCTTTA GTCATAACAA ACGCCTCCGT TATAAAACGC TATATTTAAT GATATGTGAT 180 
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TTAATAAGAC GATTCAGCAA GTTTTAAAGT ATTATTTGAC TATGTTGGAT TAGGCATCTA 300

GTCCTATAAT ATCACTGACA TTGTCAAAAT GATGATCTTT TAAGTAACGT GCGATGCCTT 360

TGTTCATTTT CTTAGTTAAA CCTGGGCCTT CAATAACAAG TGATGAATAA ATTTGAATAA 420

GTGACGCACC GTGACGCATC ATTTTGATTG CATCTTCAGT ACTGAATACX5 CCGCCTGTAC 480

CTATAATTAA AAATTCACCA TTTGTTTGCT GATAAgCATa CTTAATCAAT TTTAAATTAC 540

GTTCAAATAA TGGACGACCA CTCAAACCGC CTTCTTCGAC TTTATTAGCA GAAGTTAAAC 600

CATCTCX3TTG TCGCGTTGTG TTTGCTAAGA TGATACCGTC AAATGTCTCA GTAATCGCTG 660

GTAATAGTGC TTTTAAGCCA TCGAAATCCA TATCAGACGT TAGTTTTAAA TAAATTGGCA 720

CTGTTACATC ATGTTGTTTT TTAAATGCTG TTAAAGCTTG GCATAACATT GAAAATTCAT 780

CTTTATCATG GAAGTTTTGA AGATTTTCA6 TATTTGGAGA ACTGATGTTG ACTGTGAAAA 840

ATGAAACGTC GTGTTTAAAC GTATCAATAA CCTTTATATA ATCTTGATAA CX5CGCTTCAT 900

AAGGTGTCAT TTTATTCACA CCAACATTGA TACCAACAGG TACTTGATAA GCATTTTTAC 960

GCAAATGACT TAGTGCTTTG TTCATACCAA TATTATTGAA GCCCATTCGA TTTATCAAGG 1020

CGTCATCTTC TAATAATCTA AACATGCGTG GTTGAGGGTT ACCCGGTTGA GGTTTAGGTG 1080

TGATACCACC TAATTCTAAA GCACCGAATC CAAGGTGTTC CAATGCTTTT GGTACTTCGC 1140

AAGATTTGTC GAAACCAGCT GCTAAgCCAA TTGGATTGTC GTACGTATTA CCTTGTATCG 1200

TTTGTGATAA CGTTGGATTC TTATAAGTAA ATAGTTTATC GACGACTGGG AATAAAACCG 1260

GaAACTTTTG TaACGTTTTT AATGCATCGA TAGTTAGTCC GTGTGCTTTT TCGGGTTCGA 1320

TTTTGAATAA GAAAGGTTTA ATTAATTTGT ACATGAGTAT GCTCCTATTT CATTATATTT 1380

GAGGCTTACT ATCCTCAACT TAATATAT6T GAAATATATT CTTTTAATAG ACTAGCATTT 1440

CCAT&CATAA TTTCCTAGTT AAAACTAAAA AGTTTTGAAA ATTGACGCAA gTTTGAATAA 1500

CGTTTl'TAAG ATTAAATCAT CCTAATTAGG CAATATTATA GTATAAAGTA AGTAGATTGG 1560

AAGGTGTTTG TATGAATGAA CAATGGTTAG AGCATTTACC TTTAAAAGAT ATTAAAGAGA 1620

TTTCACCAGT GAGTGGTGGT GATGTAAACG AAGCATATCG AGTCGAAACA GATACGGATA 1680

CATTTTTCTT ACTTGTCCAA CGTGGACGTA AAGAATCATT TTATGCTGCA GAAATTGCAG 1740

GTTTAAATGA ATTTGAACGT GCAGGTATCA CGGCACCTAG AGTAATTGCA AGTGGCGAGG 1800

TTAACGGTGA TGCGTATTTA GTGATGACGT ATTTAGAAGA AGGGGCTTCA GGGAGTCAAC 1860

GCCAATTAGG GCAACTCGTA GCTCAATTAC ACAGTCAGCA ACAAGAAGAA GGCAAATTTG 1920

GCTTCTCATT ACCTTATGAA GGTGGCGATA TTTCTTTTGA TAATCATTGG CAAGACGATT 1980





EP 0 786 519 A2 



GGCTATGGGA TGCCAACGAT ATCAAAGTAT ATGACAAAGT 6CGACX3TCAA ATTGTGGCGG 2100 

AATTAGAAAA GCATCAAAGT AAACCGTCTT TATTACATGG TGACCTATGG GGTGGTAATT 2160 

ATATGTTCTT ACAAGATGGT CGTCCGGCGT TATTTGATCC AGCGCCATTA TATGGTGACA 2220 

GAGAATTCGA TATCGGTATT ACAACGGTAT TTGGTGGTTT TACGAGCGAA TTTTATGATG 2280 

CGTATAATAA ACATTATCCA CTCGCAAAAG GTGCATCCTA TAGACTTGAA TTTTATCGTT 2340 

TATATTTATT GATGGTCCAT TTATTGAAAT TTGGTGAGAT GTACCGTGAT AGTGTTGCGC 2400 

ATTCTATGGA TAAGATTTTA CAAGATACAA CAAGTTAGTT AAGACGTTAG ATTGAGATAA 2460 

ATAGATAATA TGCACAGATA TTTTTACAAT GAGAAGCGAT ACAGCTGCCT CAATAAAAAT 2520

ATTTGTGCGT TTTTATTGTT GGAAAATAAA ATTTTAATCG CTATTGTTAA TTTCTGTAAT 2580 

GTAAAACAAG GTTGAGTTAC AATAAAAGTG ATTTTATAAC TTTTTGTTCA ATAAAATTCT 2640 

AGGAATGATA CATATTTATT GATACAATAA TTTTGAATAT AATCATAAAA CAATATTTAA 2700 

GTATAATTGA ATGTTTGAAT ATCATATATT GATACAGTTT CTAATAATTT TAAAATAATT 2760 

TAAATGGAGA GAGGTGTAAA TGATGAGTAC AGTTCAAAGT GATATTTTTA AGACCAATAG 2820 

TGCATCATCA TCTATTAAAA GCGCTGTTGA AACATGTAAT AATGTGTCGA AACCGGATAA 2880

AGATGAAAGT ACAACAGTAA GTGGAAATAA TAATGCTCAT AGTGTGATAG ATGATTTGAT 2940 

GAGTAAGAAT CAATCTGTTG CTGAAGCAAT ACGAACT6CG AGCGATAATA TACAAAAAGT 3000 

TGGTGAGGCT TTTGACCAAA CTGACGTAAT GATTG6TAAT GAAATTGGTA AAAATTAAAA 3060 

CGTGGTGAAA TGATGTCGAA TAAACTGGAT GAAATCAATA AAATAATCAC AGCGAAACAT 3120 

GAGCAAATGG ATGACTTATA TGATGAAAAG CGAGAGGTTA AAGCATTGAT AGATGAAAGT 3180

GATGCGCTTA ATCATTCOAT AGATCAATTA TATCAACATT TAGGTGAGCG TTATTATAGT 3240 

AGC;^TATGG CTAGTCGTAT GGAACAGTTC CGCGATGAAT TTCATTTTGC GAAACGACGT 3300 

TCAACGGAAG CGTTATACGA GCAGCAACAG CAAATTCAAC ATGGCATTCG TAAAGTGGAA 3360 

GAAGAGATGA TTGACTTGGA AATGCGAAGG AATGTTGAAA TTGAGACGGT GACAAAGGAG 3420 

GAAAATAAAT GGAAACAATA GGAAGCATTA TTTATTTAAA AGAAGGTTCG CAAAAGTTAA 3480 

TGATTATTAA TAGAGGmCCA aTTGTAGAAA TTGAAAATCA AAAGTATATG TTTGACTATT 3540 

CTGCATGTAA ATATCCGATT GGTGTTGTAG AAGATGAAAT TTATTATTTT AACGAGGAAA 3600 

ATATAGATTC AGTTATTTTT AAAGGTTATT CTGATCAAGA TGAGGTTAGA TTTCAAGAGT 3660

TGTTTGAAAA TATGAAACAA AATTTGGATA GTGAAATACA ACGTGGAGAA GTTACACAAC 3720 

AATAAAGAAA TACTTTTTCT TTATTGGGGT GGGACGACGA AATAAATTTT GTAAAAATAT 3780 

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ATGTCATTCA TAATCATTTG AACTAAACGT AGCAGCCTTA AATTTTAAAA AAAGACACAT 3900 

ACCAACTTCC GAAATGTAGA TGAATTCTCT ACAATAACGG AAGTTTTTCT TTTAATATTG 3960 

AAATTTCTCA AGGATAGGTC TATACTTTAT AAATCGTAAT TATTACGATT TATAATCAAA 4020 

AACAATAACT TGAAATAGAT CATTGAGGGA GTGTTAATAT GCAACATCAT AAAGTGGCTA 4080 

TTATcGGTGC CGGTGCTGCA GGTATAGGTA TGGCCATTAC CTTAAAAGAT TTCGGTATAA 4140 

CAGATGTCAT TATTTTAGAA AAAGGAACAG TAGGACATTC ATTTAAACAT TGGCCGAAAT 4200 

CGACCCGTAC GATCACGCCA TCATTTACGT CTAATGGATT TGGCATGCCT GATATGAATG 4260 

CAATTTCCAT GGATACTTCA CCAGCATTTA CATTTAATGA AGAACATATT TCCGGAGAAA 4320

CATATGCTGA ATATTTACAA GTGGTTGCCA ACCATTACGA GCTGAATATC TTTGAAAATA 4380 

CAGTTGTCAC AAATATATCT GTAGATGATG CATATTATAC GATTGCAACG ACAACAGAGA 4440 

TATATCACGC GGATTATATC TTTGTCGCAA CAGGTGATTA TAATTTCCCT AAAAAgCCAT 4500 

TTAAATATGG TATTCATTAT AGTGAAATTG AAGACTTTGA TAACTTTAAT AAGGGGCaAT 4560 

ATGTGGTTAT CGGAGGTAAT GAAAGTGGCT TTGATGCTGC ATATCAACTT GCAAAAAATG 4620 

GCTCTGACAT CGCACTTTAT ACTAGCACAA CCGGTTTAAA TGATCCGGAT GCTGATCCTA 4680 

GTGTTAGATT GTCACCTTAT ACACGTCAGC GACTAGGTAA TGTCATTAAG CAAGGTGCTC 4740 

GCATCGAAAT GAATGTACAT TATACAGTTA AAGATATTGA TTTTAACAAT GGACAGTATC 4800 

ATATCAGTTT TGATAGCGGA CAAAGTGTGC TTACACCTCA TGAACCAATA CTAGCAACTG 4860 

GCTTTGATGC AACAAAAAAT CCAATCGTTC AACAATTATT TGTGACAACA AATCAAGATA 4920 

TTAAATTAAC AACACATGAT GAATCGACAC GTTATCCGAA TATTTTTATG ATTGGTGCAA 4980

CAGTTGAAAA TGATAATGCC AAATTATGCT ATATCTATAA ATTTAGAGCG CGATTTGCAG 5040 

TACfiXSCACA TCTTTTAACA CAGCGGGAAG GcTTACCAGC TAAACAAGAT GTCATTGAAA 5100 

ATTATCAAAA AAATCAAATG TATTTAGATG ATTATTCATG TTGTGAAGTG TCATGCACAT 5160 

GTTAGAAGTG AAATATGATA TGAGAACTGG GCATTATACG CCCATACCTA ATGAACCTCA 5220 

TTATTTGGTT ATTAGTCATG CGGATAAACT TACCGCAACA GAAAAAGCGA AATTAAGATT 5280 

ATTAATCATA AAACAGAAAT TAGATATTTC ATTGGCAGAA AGTGTAGTTT CTTcGCCTAT 5340 

AGCGAGTGAA CATGTGATAG AACAATTGAC ACTATTTCAA CATGAGCGAC GACATTTAAG 5400 

ACCTAAAATA AGTGCGACAT TTTTAGCCTG GTTGTTGATA TTTTTAATGT TTGCATTGCC 5460 

AATCGGTATC GCTTATCAAT TTTCAGATTG GTTTCAAAAT CAGTAT6T6T CAGCATGGAT 5520 

AGAATATTTA ACTCAAACAA CATTGCTCAA TCACOATATA TTACAOCATA TATTATTTGG 5580 
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



ATTGATTAGT TTATCAACTG CTATAATTGA TCAAACAGGA CTCAAATCAT GGATGATATG 5700 

GGCAATTGAA CCGTCAATGT TATGGATAGG ATTACAAGGT AATGATATCG TGCCACTATT 5760 

AGAAGGGTTT GGATGTAATG CAGCAGCTAT TTCACAAGCA GCACACCAAT GCCATACCTG 5820

CACGAAGACA CAGTGTATGA GTTTAATAAG CTTTGGTAGT TCTTGTAGTT ATCAAATAGG 5880 

TGCGACATTA TCTATTTTTA GTGTAGCTGG AAAGTCATGG CTATTTATGC CGTACTTAAT 5940 

ATTAGTACTT TTAGGTGGCA TCTTACATAA AGGATATGGT TGAAAAAGAA TGATCAACAA 6000 

CTTAGCGTTC CGCTACCTTA TGATAGGCAA TTACATATGC CAAATATACG TCAAATGTTG 6060 

CTACAAATGT GGCAAAATAT ACAAATGTTT ATCGTTCAAG CGCTACCTAT TTTTATCACA 6120

ATCTGTCTTA TTGTTAGTAT TTTATCACTA ACX3CCAATTT TGAATGTTTT ATCACAAATA 6180 

TTTACACCTA TATTATCGTT ATTAGGCATC TCGTCAGAAT TGTCACCAGG GATTTTATTT 6240 

TCAATGATTC GAAAAGACGG CATGCTCTTG TTTAATTTGC ATCAGGGCGC CTTATTACAA 6300 

GGAATGACAG CAACACAGTT ACTACTACTT 6TGTTTTTTA GTTCAACATT TACAGCGTGC 6360 

TCGGTCACAA TGACGATGCT TTTGAAACAT TTAGGTGGTC AGTCAGCACT AAAATTAATT 6420 

GGAAAGCAAA TGGTGACATC ATTGTCTTTA GTTATTGGTG TAGGCATCAT TGTTAAAATA 6480 

GTAATGCTGA TTATTTAAAA AAAATGAACT ATAACTGAAT ATAGAGTCAT GTCAGTCAAT 6540 

AGGAGATCTA TCTTGGAATA TGCTATTCAT ATGAAGTATA AGAGGAGAGT CGCAGATGAA 6600 

AATAGTTATT ATAGGTGGGT TTTTAGGTGG CGGTAAAACG ACTGTCTTAA ATCATTTGCT 6660 

CGCTGAATCA TTAAAGGAAT CGCTGAAACC AGCAGTCATC ATGAATGAAT TTGGGAAAAT 6720 

GAGTGTTGAT GGTGCCTTAG TATCTGAAGA CATACCTTTA AGTGAACTGA CAGAGGGGTG 6780

TATCTGTTGT GCAATGAAAG CAGATGTATC AGAACAGTTA CATCAATTAT ATTTAAAAGA 6840 

GCAACCAGAC ATTGTATTTA TTGAATGTAG TGGGATTGCA GAACCGGTCT CTGTCTTAGA 6900 

TGCTTGTTTA ACGCCTATTT TAGCTCCGTT TACAACAATT ACACATATGA TTGGTGTAAT 6960 

AGACGCAAGC ATGTATAAAC ACATTAAATC ATTCCCTAAA GACATCCAAG GCTTATTTTA 7020 

TGAGCAATTA GCATATTGTT CTGTCTTATT TGTTAATAAA ATAGATTCAG CAGATGTTGA 7080 

AACAACGAGC AAACTATTGA AAGATTTAGA AGTTATTAAC CCAGAGGCCG ATATACAAGT 7140 

CGGTATGCAT GGCAGCGTCA CTTTGCCAAT ATCAGTTAGA CAAATGACAG CAACTTCTGA 7200 

CAATAAACAT AAGTCTTTAC ATCAAATGAT TAATCATCAA TTTGTGCAAT CACCAGTCAA 7260

ATGTACTAAA GCAGAGTTTA TAAAACGTTT AGCATGCCTT CCGTCTCATA TTTATAGGTT 7320 

GAAAGGGTTT ATGACATTTG AAGACACCGC ACATACGTAT CTCATTCAAT TTACACAAGG 7380 
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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CGGAAAGGGT ATTTCAAAAG AAGACTATCA ATGTTTGGAA CAGTAGTGTT TTCAGTGGAA 7500

GAGAATGGTT AACATGCCTT CATGTATAAT AACGAGTTGA TTTGAACGTT TAAGCX5TAAA 7560

TAAAAATAAG CTTGGTCAGC CATCAAATAT AATTTGAAAA CTGTCCAAGC TGTTTTATTA 7620

GAGAACAATC AATTAACCCC ACATATTTAA TAATACATCA GCAAAGCCTT CAGGTTTTTG 7680

AATATAACCT AAGTGACCGC CTGGAATATC TACAATAGGT ATGCCAGTTT CTTTATTTAT 7740

ATAAAAGTTA ACATCTTGTG GGAAGGAGCC TCTAGAATCT GTCCCATTTA GTAGGGTGAT 7800

TTTATCGCTG TATTTTGTGA AATCATCCAA AGTAATATCT GAATGCGTAT ATTGTCTAAT 7860

TTCAAATTCT GACCAGAACA TCGTACGTTT GTACTGTTCT ATACGTCCTT CTTCAGTATC 7920

AGCAGGTTGA GACATCATTT TTGCATCAAT TGGTGCGATA TTTAATGTTT CGCCAAATGT 7980

TTTCATGCCT TTTTCTAAGC CTTCTGTTAA AATTTGATGC ACAATGTCAT CATTTTTATC 8040

TTTCCAATAA GTACTGTCTG GTAAAAATGT ATTAATTGGT GGTTCGTGAA ATGCAATCTT 8100

TTTAACGACT TCAGGGTAAT CTTTTAACAC ATGCATCGCA ACGATTGAAC CTGAACTTGA 8160

ACCTAATATA TAGACAGGTT CATCACTTAA TGACTTTGCA AGTTCGGCAA TGTCCTGTGC 8220

GTCGCGTTTG ACACGATAAT CACTGTCAGG GTTTGAAGCG GAATCAGGGA GTGGTTCAGT 8280

TAACTCGCTT TCTCCATAAT CACGACGATC AACGGCTACA ACAGTAAAAT GGTCTTTTAA 8340

CTGTTCTGCA AGAGGCAGAA AAATGTCTCC GGTACCGTTT GCACCAGGAA TAAAGATGAG 8400

CACGGGTCCT TGTCCGACTT GGTGGTATCG TAATTTAGCG CCTTGTAATT CTAAAGTTTC 8460

CATATTCAAT GACCTCCATT TGTTAATTGT TAGGTGATAA ACCTAATAAT TTAGCACCAT 8520

TTGTATAACT TATTTTCTCT TTTTCTTCAT CTGTTTU^CC CAGTTCATCT AAAAATACAC 8580

CTAATTTTTC AGGCTCAATA TATGGATAAT CAGCAGCATA AAGAATTCTA TCAATACCTA 8640

CTTCTTTCTT gactaaatca aactgtggct TCGTTAACAT GCCACTCGGT GTGATATAAA 8700

AATTATTTTT AAAGTAATAG CTTACAGGGT GGTTCAAATG TTCAGCGAAT AAAGCTTCAT 8760

CCATACGTTC TAAGAAGAAT GGGATAAACT CACCCCAATG TCCAATAATC ATATTTAACT 8820

TTGGATAACG atcaaaaata ccagataata CTAGATGTAT TGTATGAATG CCGACATCAA 8880

tgtgccaacc ataaccaaaa caagcaaatg TTGCCGCAGT TACTTCAGGA TAATTTCCTT 8940

TATAGTATGA TTGATAAATG TCACTGTTAA CTGGCGCGGG ATGTAGATAA ATCGGTACGT 9000

CTAAATTTTC AGCTGTTTTG AAAATAATGT CATATTTGTC TTGATCAAGA AAACCATCTT 9060

GTGCACGTCC CATAATGAGC GCACCTTTGA ATCCTAAATC ATTGATGCAA CGTTCGAATT 9120

CTCXSCGCTGC GGCTTCAGGC TCATTGATAG GTAAAGTTGC AAAGCCTACA AAGCGATTGG 9180
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TCTGACCAAC CAAATTTGAA GGAGAACCAT TTCCATAAGA TAAGACTTGA ATTTGAACGT 9300 

CTTGATTATT CATAAATTGG ATACGTTCAT CATGATGTGA TAATTCGTCG GCATTTGTAA 9360 

AACCTGTCTT TTTTTcAAGG CCTTCTAACA TTACTTTCAT CGGTACACCT TTAGGATCTG 9420 

CTGATATCGC ATTCATCGTT TCTTTTTGAA TATCTTCAAT GACATAATGT TCTTCAAACG 9480 

TAATACTTTT CATTTACTTC GCCTCCATAT TGTATTGCAT GTTTATTGCA TCTATTGCAG 9540 

AAGCATTTTT TATATACCTC TAATTTCAAT GTTTGTAACA TAAAATTGAT CTACCAAGGC 9600 

ATCTCTCCAT CGCCATTAAT AAATGTACCT GTTGGGCCAT CTGCACCAAT CGTTGCTAAT 9660 

TGAATGATTG GCTTGATTCC TTCAGAAACG TGTTTGGAAT TATTACTAAA ATCACCAACT 9720

AAATCAGTAT TTGTAGCGCC TGGATCAGCA GCATTGATTT GCATGTTAGG TAATCCTTTA 9780 

GCGTATTGTA GCGTTAGCAT TGTTACTGCC GATTTAGACG AACAATAAGC TAATGAATTC 9840 

ACTTTAGATT CAGCTGTTTC GGGGTTTGTA ACCATTCCAA ATGAACCTAA ACCACTTGAT 9900 

ACGTTGACGA CAACAGGTTG TTCAGATTTT TCTAAGAGAG GGACGAATGT ATTCATCATT 9960 

CGTACGATAC CGAATACATT CGTTTGATAT ACTTCTTCAA CGTCACGAGG TGTCAATTTG 10020 

GAAGGTGCTG AAAATTGACC AGATATACCT GCATTGTTAA TGAGGATATC AAGACGGCCT 10080 

TCTTTTTCAG CAATCATGTT ATAAGCATTT TTGACTGAGT AGTCACTTGT AACATCTAAT 10140 

TGTACATAAT GAACACCTAA TTTTTGTGAT GCTTGTTGTC CTCTTACATC ATTCCGAGAA 10200 

CCTATATAAA CTTTGTAACC CAATGCTTTA AGTGCCTCTG CACTTGCATA GCCTAACCCT 10260 

TTATTGCCTC CTGTGATTAA CACAATTTTA GTCATTACGT CCCACCTCAT CTAAATAAAT 10320 

GTTTAATAAA TAATTTCTGT ACGCTTCAAT TGAAATATGG CGATGCTCTA TTTGGAAGGC 10380

AAATACACTA GTTGATAATG ATTGCAACAG CATATCTGTT TTGAAtTCGT GTAAGTGTCG 10440 

TCATCGCTTT TAAATAAGTC ATAATAAAAA TCAAATAATT CTTGATAAAA TGCGCTTTGG 10500 

TAAAAACGTA ATTTATTGTT GCCTGCTTCA ATACATTGCA GTAGTGCCTT ATTATCGATT 10560 

TTAAATTGTA AAAGATAATC TAACGACACT TGCATAACCT CATAATTAGA ATGATAGTCA 10620 

TCTTTAATTT GCTTAAAATG AGTGATAAAA ATATCAAGGT CTCTTTGTAT 6ACGTAGTAG 10680 

CATAAATCGC TTTTATCTTT GAAATGTCGA TACAATGTCC CCATACCGAT ACCTAGTTCT 10740 

TTAGCAATAC GATTCATACT AATGTTTTCA ACGCCTTCTT CATCAAAAAG TTTGTGCGCT 10800 

ATTTCTTCAA TTCGTTGCCT ATTCTCTTTT GCATCTTTTC GCATGATTAC ACCTACTTAA 10860

AATTCTCTAA AATTGACAAA CGGATAACTC TCCGTTTATT ATAAAACGTG TTAAGAAAGT 10920 

TAGCAATGAA TTTGCAATAA CTATTAAATA TCATAAAAGA AAAGAGTGTT GATAATGTCT 10980 
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ACCTTATCGG TTCAAATGAT TGCTGAAAAA CTGAATGTCA CTACAGAAGA TGTGGAAAAA 
GTATTAGCTA TGACMCGCC ACTAGGCATT TTTAGTCATC AATTACAACG ATTTATTCAT 
TTAGTATGGG ATGTCAGAGA TGTAATAAAC GACAATATTA AAGGAAATGG ACAAACACCA 
GAACCATATA CGTATTTAAA AGGTGAAAAA GAGGACTATT GGTTTTTAAG A 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6261 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



11100 
11160 
11220 
11271 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

CAACCCGTTC AGAACAAAAT AAAAACCGTA CAATTTTATC ATCTTAATGA TTATTGTACG 60 

GAAAAACTTT TTTACATCAT ATCTGCATGT GCATAATCGA TATCGGTAAA TTTATTATAT 120 

TGTTTCATAA AATGTAACTT AACTGTGCCT GTTGGACCGT TACGTTGCTT AGCAATGATA 180 

ATTTCAATTT CACCGTTTTC ATCATTCGTT TGTGGCTCGA AACCACCATC ATCGTCATCA 240 

TCTTCATCGC CGCCACGGTT ATAGTAATCA TCACGGTATA AGAATGCAAC GATATCGGCA 300 

TCTTGCTCAA TCGAACCAGA TTCACGAATA TCACTCATCA TTGGACGTTT ATCTTGTCGT 360 

TGTTCAACAC CACGAGATAA CrGACTTAAT GCGATAACTG GACATTTTAA TTCACGGGCT 420 

AATGCTTTTA ATGTACGAGA GATTTCAGAA ACTTCCTGTT GTCTGTTATC GGACGCACGT 480 

GAACCACTAC CTTGAATCAA CTGTAAGTAG TCAATCACAA TCATGTCTAA GCCATGTTCT 540 

TGCTTTAATC GACGACATTT AGAACGTAAA TCATTAATTC GAATACCCGG TGTATCATCA 600 

ATAAAAATCT TCGTACGTGA TAATTTACCT ACCGCTATAG TAAAACGACT CCAATCTTCC 660 

TCAGTCATAG TACCCGTTCT TAAGCGGTTT GAGTCAACAT TTCCAGAACT ACAAATCATA 720 

CGTGTGGCTA ACTGATCAGC ACCCATCTCT AGCGAGAAAA TACCAACTGT ATACATATCT 780 

TCATGCGTTG CAACTTTTTG TGCAATATTA AGTGCGAACG CAGTCTTACC TACAGATGGA 840 

CGCGCTGCAA GGATAATTAA ATCATTTCGG TTGAACCCTG CTGTCATTTG GTCTAAATCT 900 

CGATATCCTG TAGGTATACC TGGTGTTTGA CCACTATTTT GATCAAGCTC TTCAGCTGTT 960 

TCATACACTT GTCCTAAGAC GTCTCGAATG TCTTTAAAGC CATCGCTTTC ACGAGAAGAT 1020 

GATAGCTCTA AAATTCQACG TTCTGCATCA CTTAAAATCG CATCTAGTTC AAGTTCATCA 1080 

TTATATCCAT CATTGGCAAT ACTATCTGCA GTTTGAATCA ATCTACGTTT TAATGCATGC 1140 
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TCTGCAAGAT ATTGCGGGCC ACCCGCTTcA TTCAACGTAC CTTCCGTCGA TAATTGATCC 1260 

ATCAATGTTA CAACATCAAT TTCTTTATTA TCTTCATTTA AGTGCATCAT TGCACX3GAAA 1320 

ATATGTTGAT GGGCACCCCT ATAAAACGAC TCAGGAAGCA AAACTTCCTG AGTAGTATTA 1380 

ATCAATTCTG GATCTATAAT AATTGAACCT AAGACAGACT GTTCAGCTTC ATTGTTATGC 1440 

GGCATTTGAT TTTGCTCATA CATTCTATCC ATGAATGGTT ACACCTCTTA TTTCAATCCA 1500 

ACTTTATTGT TCAACTGTGT GTACGCGAAT TGTACCTTCA ACTTCTTTAT CTAATTTAAC 1560 

AGGTACATTC GTATATCCTA GGGAATGAAT TCCATTTGGT AAATCCATTT TACGTTTATC 1620 

AATTTTAATA TCATGTTGTG CTTTTAGTGC TTCGGCAATT TGTTTTOTAC TTACTQACCC 1680

AAACAATTTA CCACCTTCAC CAGTTTTTGC TGaTACTTCA ACTTCAATGT TTGATAACGT 1740 

TTCTTTTAAT GCTTTAgCAT CTTCAATTTC TTGTTGGCGT TCTTGTTTTG CACGTTTTTT 1800 

CTGTAACTCT AATTGTTTAA GGTTACCTGG TGTTGCTTCT ACAGCATAAT TCTTTTTCAA 1860 

TAAGAAGTTA TTTGCATAAC CTACTGGTAC TTCTTTAACT TCACCTTTTT TACCTTTACC 1920 

TTTACCTTTA ACATCTTGTG TAAAAATTAC TTTCATGCAT CTTCACTCCT ACTTAATTGT 1980 

TCTGTAATTG CTTGTTGTAA TTGTGCTATC GCCTCTTCGA CTGTCACACC TTTAAGTTGT 2040 

GTTGCCGCAT TGGTTAAATG TCCACCGCCA CCAAGTGCTT CCATTGTTAA CTGGACATTT 2100 

ACTGAACCGA GTGAACGCGC AGATATACCA ATCAGATTAT CTTCACGTCT CGCAACAACA 2160 

TATGATGCTT CAATACCTTC TAAACTTAAC AGTTCATCTG CTGCTTGTGC AACTGTTACT 2220 

GGATGATAAA TTTTATCGTC TGAACCATGC GcAATGGCTA TGCCATTATC TTCAACTTTT 2280 

ACAGTTCGAA TTAATTCAGA TCGATTAATG TAAGTATCCA CATCATCTTT TAAGAAATGT 2340

TGCGTTAAAA TCGTATCTGC ACCATGTGCA CGTAAATAAC TCGCTGCATC GAATGTTCTT 2400 

GATGCTGTTC GTAATGTAAA GTTTCTTGTA TCTACAATAA TACCTGCATA CATCACTGTT 2460 

GATTCAAGAC GTGTTAAACG TTGTTCTGTT GGTTGATATT CCAGTAACTC TGTTACCAAT 2520 

TCAGCTGTCG AACTTGCGTA TGGTTCCATA TATATCAACA ATGGATTAGA GATGAAGCTT 2580 

TCACCACGTC TATGATGATC GATAACAACT TTACGGTTTG CTTTATTTAA GACATTTTCA 2640 

TCTAAAACCA GTTCCGGTTT ATGCX3TATCA ACAATCACTA CGGTTGTCTT AGATGTCATC 2700 

ATATCCCAAG CATCATCTGA TGTAATAAAT CGCTCTCTTA ACTCTGGCTT TTTATCTATT 2760 

TCGTTCATCA CGCGTCGTAA TGTTGGATCA ATGTCAGTCT CATTTAATAC GATGTATGCT 2820 

TCTAAATTAT TCATCATTGC AAATCTAGAC ACACCGATTG CTGCACCAAT TGCATCTAAG 2880 

TCAGGACGTT TATGTCCCAT GATAATGACT TTGTCACCCT CTGCAAGGAT ATCTTTTAAC 2940 
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CCATAGAAAC GCACATTACC ATTAATACTT 
TTAATTGCAA CTTGGTCGCC ACC3GCGTCCT 3060

AATGCTAAGT CTAGGCCTGA TTGTGATAAT TCACCTAAGT CGATTAAATT TTCAGTACCT 3120

TCACCAACAC CGATACTTAA TGTTAATTGG GCACGATAAC CAACACTTTT TTCACGTAAT 3180

TGACTCAAGA TATCAAATTT AGATTCTTCT AAGTCAGCTA ATATTTTTTG ATTTAAATAG 3240

GCTACGAATT GATCGGAACT GTATCTTTTG AAAAATATAT TATACTCAGT TGCCCATCGA 3300

CTAATGACAC GCGTTACCAT TGAGTTGATT TCCGAACGCT GCGTATCATT CATATTTTGC 3360

GTAATCTCAT CGTAGTTATC TAAAAATAAT GTCGCAATGA TTGGTTTAGA ATTTTCATAT 3420

AGTTCATTTG TTTGTACTTG TTCAGTTATA TCAAAGAAAT AGAGGCAGTG ATCATTCTCA 3480

GAATAACGTA CTTGGAAATG ATACTGATTA TATTCTATTT cAACGGATTT CACTCTATCT 3540

AATTGCTTTA AAATGTTTGG AAATACTTCA TTTACAGATT CAGAAATGAC ATTCGCTTCC 3600

ATATGATCTG TCATAAATTG GTTAACCCAT TCGATGTGAT CATTTTCATC TAAAAGAATG 3660

ATACCAATTG GTAAATGTTT GATTGCTTTA TTATTTGTTG TTGAAATTTG AGCACTCAAA 3720

CCATCTACAT AACTATCCAT TTTCATTAAA GCTTGTCTGA ATAAAATGAT GCTAACAATA 3780

ATCATCACGA CAAGAACGAT AGATGCAATT AGTGCTATAA GACTATTAAA GATAAACCAT 3840

ACACCCATTA AAACAATTGC TGTGATGATC ATGATGACAA ATGGTATTAG TAAAGCTTTC 3900

TTAGTGGACT GCCGATTCAT TATTCCACCT CTATTCACTT TTTAGAATTA TTTTTCATGA 3960

TTCGCTTCAA ATTCAAACTT AAATCGATAA CACCAAGTAG TCCTACAATA TGTGTCGTAG 4020

GTGTCAGTAT TGTACCGATA ACCAATAGTA AAATCGTTAC TGCATTCGGC AAACCTTTCG 4080

CTTTACCAAA GAAATGAATA ACACTTAAAC CTTGAATATA CATTACTAAT GATAACACAA 4140

GTTGGAAGTT TAAAAGAATG CTCTGGAACA CACTCGGTTG ACCTGTAAAT AATAAACATA 4200

TGA-BAACAAT AATGTATATC CATAATAAAA TACCGCTCAT TTGCCACGCG AAAAGTG6CT 4260

TAAATACAGG TGTAGCGATT TTAAATTTTC GTAAAATCGG AAATGTAACG ATTAAGTTAA 4320

TTAAGACGAT TAAAAATGTA ATGATAATGA TGAAACCTGG TAATTGAACG GTCGCTTGTC 4380

TAAACCCTTC TTCTAATATT TGGGTCATAT TCGCATCGGC ACCGCTCATC GTAATCGCTT 4440

CATGTAATGT TTGCTTGAAA GGTTTTACTA TGCTCGCTGA TGGTGGAATC CTTCCGAATG 4500

TTTGTAGTAA CATAAAAGCG ATTAATGAAA TTnArCTCAT CGCTACTGTT GTTACGTATA 4560

ATATTCTTTC TTTAGACGTT CTTTCTTTGA GCAATTGACC AATAATTAAA CTTGCAATTA 4620

AGACTAATAT GATGGCACTT AAAACGAAAG TATTACCTAA AACAGTTGTT ATAATTACTG 4680

TAATAAGTGC ACTAATCCCG AAAGATTGTA TTGATTTATT CCATAAAACG ATACCTGGTA 4740

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CAAATACCAA CGCAATCGTT GCAATTATTG TTGCTTTAGG TTGTATTTTT GAAAACACAT 4860 

AAGCCACTCC CATATTTTTA ACTATAGCTA TTATTTTAAC CTCTTTAATG AAAATTAACA 4920 

ATTTATAGAT TGTATGCTTC TATTTCATTT AATTGAATAA TAACTTTCAT GTTTTATAAG 4980 

TAATTAACAT ACTCATTT6A ATCGCTTTTG TGTGCTTTCA TTTTCAACAT GATTATTTAA 5040 

TCCCACTACA TAGCAATCAA GCTTGATTTA GATTTACAAT ACATTTCCAC TCTCATGTAC 5100 

TCTAGATGTT TTTGAATATG ATAACTGTGA TTTAGTGGCT TCATTCTTTG AAAATATATA 5160 

TTATTACTTA CGCTTAAAAT GCTTTAAATT TAAGAAATGA TATAAGTTAG GTGCCCAGGT 5220 

ACTAAAGTTT AGTAGGaATC CATCATGCCC AACATTATCA GGCACGAAGA AATGACGATG 5280 

ATATTTAAAA CGTTCACCTA ATGCACGAAC TTGATCATCC GGATATAGCA AATCATCTAT 5340 

GAACCCCATC GTTAACACTT TTGTTTCTAA ATTTTTAAAA ACATGCGTTA CGTCTGTGCG 5400 

ACCTCGGTCA ATGTTGTGAC TATCCAATAC ATCTAGCAGT GTCAGATAAC AATTCAAATC 5460

AAAATGTTCT TTAAATTTAT TACCTTGATG TTGTTGGTAT GCGACTACTT CATCCGGCGT 5520 

AAAACGTTCA TCATAACTTT TTGATGATCG ATATGTCAAA AAACCTAATT GGCGTGCAAT 5580 

ACTTAGACCT TCCTTACCAC CAAGATGAAT GGCTTGCCTT GCAATTTCAT TGAAAGCTCT 5640 

ACTATAAGAT GATGTTCGAC TTGTTGCAGC AAGGATAATG GCTTTATCTA CTTCAAACTG 5700 

TTGATTGTAG AGTAGTTCCA TTGCTTGCAT ACCTCCAAGA CTTCCCCCTA TTAAAATATT 5760 

AATCTTATCA TAACCAAGGG CTTGTATACC TCGTTCATTC GCTCTGACTA TATCTCTTAA 5820 

TGTTAATTTT TTAGGAAAAT GAGGGTCGTT TAAAGGTGAA CTTGAACCGA AAGGACTACC 5880 

AATAACATCA AATGTTAAAA ATTGATAATC GTGAATGGGT ATATATCCCC CATCAATAAT 5940 

TTCTCGCCAC CAACCCGGAT AATCATCTGT TCCATATGTT AAATGATTGC CAGTTAATGC 6000 

ATGACAAACT ACAACTAATG GTTGTCCATG ATAACCGACA TGCTCATATC TCAAACGCAA 6060 

GTnATCTATG ACTTCCCCAG ATTCTGTAAT AAATTCCCCT AAATTTAAAG TATCTACTGT 6120

GTAATTTGTC ATTGTTCTTT CCTCCTTAAA CAAAAAAACT TCTCACCCTA TTGAAAAGTA 6180 

AGAAGTCTTT ATACTTATCA TTCGAGTAAC TCGTTGGTTT TAGCACCGTG CTATAAAGTC 6240 

GGTTGCTGAA GTATCACAGG G 6261 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1222 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

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TTCTTTTGGC 


ACGACATAAT TGTCTTTATC TTGAACTAAA TATCCGCCAG ATACTGAAAC 180

AAACTCTTCT TCGTTACTGT CTATAGTCAT ATCAATTTCT AATAATCTTA CATTCTTCTT 240

TTGTTTTAAA ATATCTAATG CTTCATCTGT AAATTTTGGT GCAATAATGA CTTCCAAAAA 300

GATACTATGC AATTGCTCTG CTAACTCAGG TGTTACAGCT CGGTTTAATG CAACAATTCC 360

ACCAAATATT GATTGACTAT CCGCTTCATA CGCATGTTGA AATGCTTGTT CTATCGTGTC 420

ACCGATACCA ACACCACATG GATTCATGTG [illegible] ACTGTAGCAG GTGTATCAAA 480

CiTlTTAACT AAAGCTAGTG TAGCATCTGC ATCTTTAATA TTGTTATAGC TTAATTGTTT 540

CCCATGTAAT TGTTTAGCGC CTGCAATCGT GTGCTTAGCA TTCGAAGTTC TCACAAAATA 600

CGCTGATTGT TGTGGATTTT CTCCATATCT TAAAGTTTCT TTATCCCCTT TAAAGAAACX^ 660

TACAATCGCT TCATCATATT CTGCAGTATG CTCAAAAACT TTAATCATTA ATGATTGTCT 720

ATATGACTCA TCTAACGAAT CGTTTCTTAA TCGCGTCAAT ACTTCTTGAT AATCTGCCGG 780

ATGTACAATT GTTGTTACAT GTTTATAGTT TTTAGCTGCA GCACGTAACA TTGTTGGACC 840

ACCAATATCA ATATTTTCAA TTGCTTCGTC CATCGTCACA TCAGGGTTTG CAACAGTTTG 900

TTGGAATGGA TATAAATTAA CTACTACCAT ATCAATTAAA TCTATATGTT GTTCTGATAA 960

TTCATTTAAA TGCTGCGGTT TATTTCGATC AGCTAAAATG CCACCATGAA CAGCCGGATG 1020

T 1021 

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3759 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TCATTCACTC CTAAATTGTT ATTACACTAT TACACaTAGC TAATCATCAA TGTGAAATCA 60

CCTTCAAAGA CACTATCCAA ATCTTCAGAA GTCAAAATAA AGTTTGTACC AGTAGTCAGT 120

TTGAAAATTT CACCATCGAC AATCATTTGC CCTTCGCCTT CCAACACTGT AACTAAACAG 180

AACTCTCTAG GCTTCATATA ATTTAACGTG CCAGAAATTT CCCATTTAAC CAATGTAAAG 240

AAATCATTCG ATACAATGTG TGTACACTTA TGGTTTTCAA TAATTTCGCT TTCAGGCAAA 300

ATATTAGGTA ATGGTGCATT GTACTGAATA ACGTCTAAAG CTTTTTCAAT ATTTAACGGT 360

CTATCATTAT ATTGATTATC TTGACGATTG AAATCATAAA GTCTATATGT AATGTCTGAC 420
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ATAAAAtAGa ATTCyCCAGG kTTTACtTTA 
AtatATCyAA gTAtCGaCtC tATCGTTCCG 540

TGTTGAACAT GATTCGCAAC TTCTTCTCTA 6ACTCTGCTA ATGTCCCtAT AACTATTTCT 600

GCATCTTCTT CTGCATCTAT AATATACCAA CATTCAGATT TGCCATATTG CCCgTTTTCA 660

TGCTCATAAG CATAAGAATT ATCAGGGTGC ACATGAATAG AAAGTGATTC TCTTGCATCC 720

ACTATTTTAG TTAGAAGCGG AAAATCTTTG CTTGGGAAAT CACCAAACAA TTCACGATGT 780

TCTGACCAAA TACGGTCTAA TGTTTGACCT TGATATGGTC CATTAATAAT CTCGCTCGTA 840

CCATTTGGAT GTGCTGACAC ACACCAACAT TCCCCCAGTT GTATCATTGT CTAATTGATA 900

TCCAAACTCA CTTAGACGTT GACCGCCCCA TAATTTTGTT TTTAAAATTG GTTGTAAAAA 960

TAATGGCATT GTTGCACCTC CATTGTGATT AAGTAAGCAA TAGAACTCTG ATGTTGTTGT 1020

TCCATTATAT TTTGATTTTG TTCTCATTTA CATCGTATTA TTAACTTCCA CATTTCAAAT 1080

TAACTATTAG TGATTGTACC ATATTTACTA ACATTGCAGT ACTGCCAATT AAAAGnGCTT 1140

CACTTAAATT TACAGTACTT TAACATTTTC AAAAATTTAT AGCATAGAGA TTATATCTCT 1200

CTTACATTTG TACATATTTC CCTTTAAATT TACTCGCCCA TTATACCAAT TAATAaACAA 1260

CTTTAATAGT TGTGCCATAC ATTGTTCAAA TTCTTTGTAA AACGCATAGA CAATACGTAC 1320

TTATTCATAC TTATAATTCA TCATTTTCAA AAAATAACGA GTTACGAAAA AGTAACCCGC 1380

TTCAAATCAT ATTTACTATC CTTATTAATC CGTTTCATTT TCAAATTGAG TTAAAGCATC 1440

TTTAATGTCC TGATCACCAC TAATAATTTG AAACTCTTGG TGATTAAAAT GATTGGATGT 1500

6ACAATTTCT TTTAATACTG TCGCAACATC TTCTCTAGGA ATTTCACCTT TACCATCAAA 1560

ATATTGTGCA GCTTCTATCT TTCCAGATCC TGCTGCATTT GTAAGTGCCC CTGQATGTAA 1620

AATTGTATAA TTCAAACCTG nAACGTCTTA AATAGTCATC AGCGTAATGT TTAGCTATTG 1680

TATATGGCTT TAAATCACCG CTATCATCAA AAGCCTGACG TCTCGAATCA TATGTTGAAA 1740

CCATGACATA 6TGTTTAATA TTGGCCTCTT TACTCGCAAT CATTGATTTA ACAGCACCAT 1800

CTAAATCGAC AATAATTGTT TTATCTGCAC CCGTGTTCCC TCCAGAACCT ACTGAAAAGA 1860

TAACTTTATC GAATGGTTTA AACGTCTCAG TTAAAGTCTC TATTGAATCA TTTTCAACAT 1920

CAACAAGAAT TGCTTTCATA CCTTGTGATT TTAACGCATT AAGTTGATCT GATTGCCTAA 1980

CACCAGCAGT AAATGGTACA TTTTCTTTTG CTAATTGTTG CACTAGTAAC GAACCTACAC 2040

CGCCATTAGC ACCTATAACC AAAATATTCA TTTACAACAC TCTCCTATkT ATTATTCTCT 2100

ATGCCATACC ACTTTATGAG ATATGTAAAA CTTGTTACAA CTATAAAAAT CAATTGACAT 2160

ACTACTGGGA ACGTATTAAA TTAATATATG AACAAATATT CATATGAAAG GATT6TCATA 2220
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tCaAGGCATT 


AGcGATTACA ATCGAATACG TATCaTGGAA TTGTTATCaG TCAGCGAAgC 2340

AAGTGTT6GT CACATTtCAC ATCAATTQAA TTTATCTCAA TCAAATGTCT C6CACCAATT 2400

AAAATTACTT AAAAGTGTGC ATCTTGTGAA AGCAAAACGA CAAGGCCAAT CAATGATTTA 2460

TTCATTAGAT GACATCCACG TAGCAACTAT GTTAAAGCAA 6CCATACATC ACGCGAATCA 2520

TCCTAAAGAA AGTGGGTTAT AATATGTCTC ATTCACATCA TCATCATGAC CATATGCATA 2580

GTCATGTAAC TACAAATAAT AAGAAAGTAT TGTTTATATC GTTTTTAATA ATCGGTCTAT 2640

ATATGTTTAT CGAAATCATC GGCGGTCTCC TTGCTAACAG CTTGGCATTA CTATCTGACG 2700

GTATCCATAT GTTTAGCGAC ACATTCTCAT TAGGTGTTGC ACTTGTCGCA TTTATTTATG 2760

CTGAAAAGAA TGCCACAACT ACAAAAACAT TTGGTTATAA ACGTTTCGAA GTACTCGCAG 2820

CGTTATTTAA CGGTGTAACG CTTTTTGTAA TAAGTATTTT GATTGTTTTT GAAGCGATTA 2880

AACGTTTCTT TGTTCCTTCT GAAGTTCAAT CAAAAGAAAT GTTAATCATT AGTATTATCG 2940

GTTTAATTGT CAATATCGTT GTTGCATTCT TTATGTTTAA AGGCGGCGAC ACTTCACACA 3000

ATTTAAATAT GCGTGGTGCT TTTCTACATG TTATCGGAGA CTTATTAGGT TCAGTTGGCG 3060

CCATTACTGC AGCTAkTTTA ATTTGGGCAT TTGGATGGAC AATCGCCGAT CCTATCGCAA 3120

GTATTTTAGT TTCCGTTATT ATTTTAAAAA GTGCTTGGGG TATCACAAAA TCTTCAATTA 3180

ACATTTTAAT GGaAGGCACA CCAAGTGATG TTGATATAGA TGAAGTTATA ACTACTATTA 3240

AAAAGGATTC ACGAATACAA AGTGTGCATG ATTGCCATGT TTGGACAATT TCAAATGATA 3300

TGAATGCATT GAGTTGTCAT GTTGTTOTAG ACCATACATT GACAATGAAA GAATGTGAAT 3360

TATTATTAGA AAaCATTGAG CATGATTTAT TACATTTAAA TATTCACCAT ATGACTATTC 3420

AATTAGAAAC GCCTAATCAC AAACATGATG AATCGATTAT ATGTTCAGGA ACACATAGTC 3480

ATTCACATAA CCATCATGCT CATCATCACG CGCATGTACA TTAATAATTT TAACCTACTG 3540

CCATTGCATC GATTAAACTT TTCAATGGCA GTAGGTTTTT TATGTCTTTA TGGCGACTTG 3600

TTTGGTCTTT GATGATGCAA TGTTTATTAA CAAATTTTCA ACTATTATTT CTTACATTAG 3660

TCATATTTTT GACAATTTAC TATTATAATT CTCTAACTTT AGTCACTTTA ATTAATTTTT 3720

ATTAGATATT AATATGAAAA TAACGTGTTT TTTGTTATT 3759



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13086 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 





TAATTATCGC GCATAACAAA ACATTAGCAG GACAATTATA TAGTGAGTTT AAAGAATTTT 60

TTCCTGAAAA CAGGGTGGAA TACTTTGTAA GTtACTATGA TTATThTCAn CCAGAGGCAT 120

ACGTACCGTC TACTGACACT TTTATTGAAA nAGATGCCTC AATCAnTGAT GAAATTGATC 180

AACTACGACA TTCTGCTACA AGTGCATTAT TTGAACGCGA TGATGTAATT ATTATTGCTA 240

GTGTAAGTTG TATATATGGT TTAGGTAATC CTGAAGAATA TAAAGATTTA GTAGTAAGTG 300

TTCGAGTTGG TATGGAAATG GATAGAAGTG AATTACTTAG AAAACTTGTc AGATGTGCAA 360

TATACACGAA ATGACATCgA TTTcCAACGA GGAACGTTTC GAGTGCGTGG TGATGTAGTG 420

GAAATATTCC CAGCCTCTAA AGAAGAACTT TGTATAAGGG TTGAGTTTTT CGGCGATGAG 480

ATTGACCGTA TCCGAGAAGT TAACTACCTA ACAGGTGAAG TGTTGAAAGA AAGAGAACAT 540

TTTGCGATAT TCCCAGCTTC TCACTTCGTA ACAGGTGAAG AAAAGTTGAA AGTTGCGATT 600

GAACGTATTG AAAAAGAATT GGAAGAACX3A TTQAAAGAAT TACGAGATGA GAATAAATTA 660

CTAGAAGCGC AAAGGTTAGA ACAGCGTACC AACTATGATT TAGAAATGAT GCGAGAGATG 720

GGATTCTGTT CAGGAATTGA AAACTATTCC GTACATTTAA CTTTGCGACC ACTGGGTTCG 780

ACACCATATA CTTTATTGGA TTACTTTGGC GATGATTGGT TAGTAATGAT TGATGAATCA 840

CATGTGACAT TACCGCAAGT TCGAGGCATG TATAACGGAG ACAGAGCGCG TAAACAAGTT 900

TTGGTGGATC ATGGGTTTAG ATTACCGAGT GCATTAGATA ACCGTCCACT TAAATTTGAA 960

GAATTTGAAG mAAAGACAAA ACAACTTGTG TATGTATCTG CAACGCCTGG ACCATACGAA 1020

ATTGAACATA CGGATAAGAT GGTTGAACAA ATTATTCGTC CTACTGGTTT ACTGGATCCT 1080

AAGATTGAGG TTAGACCTAC TGAAAATCAA ATTGACGATT TATTAAGTGA AATTCAAACA 1140

AGAGTgAGCG TAATGAACGC GTACTTGTTA CAACGCTCAC TAAAAAGATG AGTGAAGATT 1200

aACCACATAC ATGAAAGAaG CGGGTATTAA aGTtAATTAT CTGCATTCAG AAATCAAGAC 1260

ATTAGAACGA ATTGAAATAA TTAGAGACTT ACGAATGGGT ACATATGATG TTATCGTAGG 1320

TATTAATTTA TTAAGAGAGG GTATTGATAT ACCAGAAGTT TCTCTAGTTG TCATATTAGA 1380

TGCAGATAAA 6AAGGGTTTT TACGTTCTAA CCGCTCATTA ATTCAAaCAA TAGGTAGAgC 1440

TGCGCGTAAC GATAAaGGTG AAGTCATTAT GTATGCCGAT AAAATGACTG ATTCGATGAA 1500

GTATGCAATT GATGAGACAC AACGTCGTCG AGAAATACAG ATGAAACATA ATGAAAAACA 1560

TGGTATTACA CCTAAAACAA TTAATAAAAA AATACATGAT TTAATTAGTG CTACTGTTGA 1620

AAATGACGAA AATAATGACA AAGCACAAAC TGTGATACCT AAGAAGATGA CGAAAAAAGA 1680



EP 0 786 519 A2 





TTTCGAGAAA GCTACAGAAT TAAGAGATAT GTTATTTGAA TTAAAAGCAG AAGGGTGACA 1800

AGTAAATGAA AGAACCATCC ATAGTA6TAA AAGGTGCTCG TGCGCATAAC TTGAAAGATA 1860

TTGATATCGA ACTACCTAAA AaTAAATTAA TTGTTATGAC AGGTTTATCT GGGTCAGGTA 1920

AATCGTCATT AGCATTCGAT ACTATATATG CTGAAGGACA ACGACGTTAT GTTGAATCAT 1980

TAAGTGCCTA TGCGCGTCAA TTTTTAGGCC AAATGGACAA ACCAGATGTT GATACAATTG 2040

AAGGATTATC GCCAGCAATT TCAATAGATC AAAAAACAAC AAGTAAAAAT CCAAGATCAA 2100

CTGTAGCAAC AGTAACAGAA ATATATGATT ATATACGTTT GTTATATGCA CGTGTTGGTA 2160

AACCTTACTG TCCAAATCAC AATATAGAAA TTGAATCGCA AACAGTACAA CAAATGGTTG 2220

ACCGCATTAT GGAATTAGAG GCACGTACAA AGATTCAATT ATTAGCACCT GTCATCGCTC 2280

ATCGTAAAGG TAGTCATGAA AAGCTAATCG AAGATATTGG TAAAAAAGGT TATGTACGTT 2340

TAAGAATCGA TGGCGAAATT GTTGATGTAA ATGATGTACC TACTTTAGAT AAGAACAAGA 2400

ATCATACAAT AGAAGTTGTT GTAGACCGAT TAGTTGTTAA AGATGGAATT 6AAACACGAC 2460

TAGCTGACTC TATAGAAACT GCCTTAGAGC TTTCAGAAGG ACAATTAACA GTCGATGTCA 2520

TTGACGGGGA AGACCTTAAG TTTTCAGAAA GCCATGCTTG TCCTATATGT GGATTTTCAA 2580

TCGGAGAGTT AGAACCAAGA ATGTTTAGCT TTAACAGTCG TTTTGGTGCT TGTCCGACAT 2640

GTGATGGCTT AGGCCAAAAG TTAACAGTCG ATGTAGACTT GGTTGTTCCC GACAAAGATA 2700

AGACGCTAAA CGAAGGTGCA ATAGAACCTT GGATACCGAC GAGTTCTGAT TTTTATCCAA 2760

CATTGTTAAA ACGTGTTTGT GAAGTTTATA AAATCAATAT GGATAAACCT TTTAAAAAGT 2820

TAACAGAACG TCAACX5TGAT ATTTTATTGT ATGGTTCTGG TGACAAAGAA ATTGAATTTA 2880

CATTTACACA ACGTCAAGGT GGTACTAGAA AACGAACAAT GGTTTTCGAG GGTGTAGTTC 2940

CTAMATAAG TAGACGATTC CATGAATCTC CTTCAGAATA TACACGTGAA ATGATGAGTA 3000

AATATATGAC TGAACTACCT TGCGAAACTT GTCATGGAAA GCGATTGAGT CGTGAAGCkT 3060

TATCTGTTTA TGTAGGTGGT TTAAATATTG GTGAAGTAGT CGAATATTCA ATCAGTCAAG 3120

CGCTGAACTA TTATAAAAAC ATTGATTTGT CAGAACAAGA TCAAGCGATT GCAAATCAAA 3180

TATTGAAAGA AATTATTTCC CGACTCACTT TTTTAAATAA TGTGGGACTT GAATATTTAA 3240

CGTTAAACAG AGCTTCAGGT ACACTTTCAG GTGGTGAAGC ACAACGTATT CGATTAGCAA 3300

CGCAAATTGG GTCGCGTTTG ACTGGTGTCT TATATGTATT AGATGAGCCA TCAATTGGAC 3360

TGCATCAAAG AGATAATGAT CGATTAATTA ATACACTTAA AGAAATGAGA GATTTAGGAA 3420

ATACTTTAAT TGTAGTTGAA CACGATGATG ATACAATGCG TGCGGCTGAT TACTTAGTGG 3480
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AGGTAATGAA AGATAAAAA;^ TCATTAACAG GACAATACTT GAGTGGTAAG AAACGTATTG 3600 

AAGTACCTGA ATATCX3CAGA CCGGCTTCAG ATCGTAAAAT TTCTATACGT GGAGCTAGAA 3660 

GCAACAATCT TAAAC3GGGTT GATGTGGACA TACCACTATC AATCATGACG GTTGTTACAG 3720 

GTGTATCAGG TTCTGGTAAA AGCTCATTAG TAAATGAAGT ATTATACAAA TCATTAGCTC 3780 

AAAAAATTAA TAAATCTAAA GTAAAGCCAG GATTGTACGA TAAGATTGAA GGTATTGATC 3840 

AACTTGATAA AATTATTGAT ATTGATCAAT CACCAATAGG TAGAACGCCA CGCTCTAATC 3900 

CAGCAACATA TACTGGTGTG TTTGATGATA TACGTGATGT GTTTGCGCAA ACAAATGAAG 3960 

CTAAAATTCG AGGATATCAA AAAGGGCX3TT TTAGTTTTAA TGTAAAAGGT GGACGCTGTG 4020 

AAgcTTGTAA AGGTGACGGT ATTATTAAAA TTGAAATGCA TTTTTTACCT GATGTTTATG 4080 

TTCCTTGTGA AGTGTGTGAT GGTAAACGAT ATAATCGTGA GACACTAGAG GTTACTTACA 4140 

AAGGTAAAAA TATTGCTGAC ATTTTAGAAA TGACTGTTGA AGAAGCAACA OU^TTTTTTG 4200 

AAAATATTCC TAAC3ATTAAG CGCAAGTTAC AAACACTAGT TGATGTTGGT CTTGGATACG 4260 

TCACATTAGG TCAACAAGCT ACAACGTTAT CAGGTGGTGA GGCTCAACGT GTGAaACTTG 4320

CATCTGAACT TCATAAACGT TCAACTGGTA AATCTATTTA TATCCTAGAT GAACCGACAA 4380

CAGGGTTACA TGTTGACGAT ATTAGTAGAT TATTAAAAGT ATTAAACCGA TTAGTTGAAA 4440

ATGGTGATAC TGTTGTAATT ATTGAACATA ACCTAGATGT TATCAAAACA GCAGACTATA 4500

TTATAGACTT AGGTCCTGAA GGTGGTAGTG GCGGTGGTAC TATTGTTGCG ACTGGCACAC 4560

CCGAAGATAT TGCTCAGACA AAGTCATCAT ATACAGGAAA GTATTTAAAA GAAGTACTTG 4620

AACGAGATAA ACAAAATACT GAAGATAAAT AAGATTAAAA GAAGTGAAGG ATGTTATAAA 4680

TTTATCCTTC GCTTCTTTTT ATTAATTTAG TAATGAATAG TAGAAAGAAA AGATGCGTAA 4740

AAAGAATTAT GTTAAGATAG GGTCAATCTA GAGTAGTTAA ACATAAATCG AACTGGGAGT 4800 

GGQACAGAAA TGATAAAGAA TCACTAATGA TTTATTATGT AGTGGTTCTT TGTCATTAGC 4860 

CACAGCTATT GTGTACTTAA AAATAGGaat GCaTgAGTGC AACTCA7GCA TAAGaAATAC 4920 

TAATTTCTAA AGAAAAAGTA TTTCTTTATG TTGGGGCCCC GCCAACTTGC ATTGTTTGTA 4980 

GAATTTCTTT TCGAAATTCT TTATGTTGGG GCCCCGCCAA CTTGCATTGT TTGTAGAATT 5040 

TCTTTTCGAA ATTCTTTATG TTGGGGCCCC GCCAACTAAT TCCAATATAT CATTGTAGAG 5100 

CTTAGGTCAT TGATTTTTGG CTCGGACTTT TATGGCGATA TGAACCATGT AAATTAAGCA 5160 

AGCAATAAAT TAATGATTGA TATTGACTTG TAAAATAATA ACAATAATGA ACAATTAATA 5220

TTTATTTTAG CTTTTCAATG TAGATTGGTG TTATATTTTT GATATGATAA GAAGAGATGT 5280 
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ACATTAAAGT TAGATTTAAT CGCTGGTGAA GAAGGACTAT CGAAGCCAAT TAAAAATGCT 5400 

GATATATCAA GACCGGGCTT AGAGATGGCA GGTTATTTTT CACATTATGC GTCAGATAGA 5460 

ATACAACTAT TAGGAACAAC GGAACTATCG TTTTACAATT TATTACCAGA TAAGGATCGC 5520

GCAGGTCGTA TGCGTAAACT ATGCAGACCA GAAACX3CCTG CAATTATTGT GACACX3TGGA 5580 

TTGCAGCCAC CAGAAGAATT AGTTGAAGCT GCAAAAGAAT TAAATACCCC ACTTATAGTT 5640 

GCTAAAGATG CGACTACAAG TTTAATGAGT CGCTTAACAA CGTTTTTAGA GCATGCACTT 5700 

GCAAAGACGA CATCTTTACA TGGTGTTTTA GTAGATGTTT ACGGTGTTGG TGTACTAATT 5760 

ACCGGTGATT CAGGAATAGG TAAAAGTGAG ACTGCGTTGG AATTAGTTAA ACGTGGGCAT 5820 

AGATTAGTAG CAGATGATAA TGTAGAAATA CGTCAAATTA ATAAAGATGA ACTAATAGGG 5880 

AAACCACCAA AGTTAATAGA ACATCTATTA GAAATACGTG GACTAGGTAT TATCAATGTT 5940 

ATGACTTTAT TTGGCGCGGG TTCAATATTA ACTGAAAAAC GAATTAGATT AAATATTAAT 6000

TTGGAAAACT GGAACAAGCA AAAGTTATAT GACCGCGTAG GTCTTAATGA AGAGACGCTA 6060 

AGTATTTTAG ATACTGAAAT CACTAAAAAA ACAATACCTG TAAGACCTGG TAGAAATGTT 6120 

GCGGTAATTA TTGAGGTCX?C TGCAATGAAC TATCX5ATTAA ATATCATGGG CATTAACACG 6180

GCCGAAGAAT TTAGTGAAAG ATTAAAT6AA GAAATTATCA AGAACAGTCA TAAGAGTGAG 6240 

GAGTAGGTTG AATGGGTATT GTATTTAACT ATATAGATCC TGTGGCATTT AACTTAGGAC 6300 

CACTGAGTGT ACGATGGTAT GGAATTATCA TTGCTGTCGG AATATTACTT GGTTACTTTG 6360 

TTgCACAACG TGCACTAGTT AAAGCAGGAT TACATAAAGA TACTTTAGTA GATATTATTT 6420 

TTTATAGTGC ACTATTTGGA TTTATCGCGG CACGAATCTA TTTTGTGATT TTCCAATGGC 6480 

CATATTACGC GGAAAATCCA AGTGAAATTA TTAAAATATG GCATGGTGGA ATAGCAATAC 6540 

ATGGTGGTTT AATAGGTGGC TTTATTGCTG GTGTTATTGT ATGTAAAGTG AAAAATTTAA 6600 

ACCCATTTCA AATTGGTGAT ATCGTTGCGC CAAGTATAAT TTTAGCGCAA GGAATTGGAC 6660 

GCTGGGGTAA CTTTATGAAT CACGAGOCAC ATGGTGGATC GGTGTCACGC GCTTTTTTAG 6720 

AACAATTACA TTTGCCTAAT TTTATAATAG AAAATATGTA TATTAACGGC CAATATTATC 6780 

ATCCAACATT CTTATATGAA TCCATTTGGG ATGTCGCTGG ATTTATTATC TTAGTTAATA 6840 

TTCGT/UUICA TTTAAAATTA GGAGAAACAT TCTTTTTATA TTTAACTTGG TATTCAATTG 6900 

GTCGATTCTT TATAGAAGGA TTACGTACAG ATAGCTTAAT GCTCACAAGT AATATTAGAG 6960 

TTGCACAATT AGTATCAATT CTTTTAATTT TAATAAGTAT AAGTTTAATT GTATATAGAA 7020

GGATTAAGTA TAATCCACCG TTGTATAGCA AAGTTGGGGC GCTTCCATGG CCAACAAAAA 7080 
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TTATGGCGTG TATACCGTCT TGTTAAATTT TCGAAAGTTT TTAAGAATGT AATTATCATT 7200 

GAATTTTCGA AATTTATTCC AAGTATGGTA CTGAAAAGAC ATATATATAA ACAACTTTTA 7260 

AATATTAATA TCGGTAATCA ATCGTCGATA GCTTATAAAG TAATGTTAGA TATTTTTTAC 7320

CCAGAACTGA TTACGATTGG TAGTAACAGT GTTATTGGTT ACAATGTAAC AATTTTGACG 7380 

CATGAAGCAT TAGTTGATGA ATTTCGTTAT GGACCAGTGA CGATAGGATC TAACACTTTG 7440 

ATTGGTGCAA ATGCTACCAT TTTACCCGGT ATAACGATTG GTGACAATGT AAAAGTTGCA 7500 

GCTGGTACGG TTGTTTCAAA AGATATACCG GATAATGGAT TTGCATATGG CAACCCTATG 7560 

TATATAAAAA TGATTAGGAG GTGACAATTT TATGGCGCAA AAGAATAATA ATGTAATTCC 7620 

AATGACTTTT GATGATGCAT TTTATCGTAA AATGGCTAAA CAGAAGTTTA AACAAAGAGA 7680 

ATATAAACGA GCTGCTGAAT ACTTTGAAAA AGTGTTAGAA TTGTCACCTG ATGATCTGGA 7740 

AATTCAAATT GATTATGCAC AATGTCTAGT GCAACTTGGT ATTGCTAAAA AAGCAGAACA 7800

TTTATTTTAT GACAATATTA TTTATAATAG GCATCTAGAA GATAGCTTTT ATGAATTGAG 7860 

TCAGCTCAAC ATTGAAGTTA ACGAACCAAA CAAGGCATTC TTGTTTGGTA TTAATTATGT 7920 

TATTGTTAGC GACGACCAAG ATTATAGA6A TGAATTAGAT CAAATGTTTG ATGTGAAATA 7980

TCAAAGTGAA GAACAAATTG AACTTGAAGC TCAATTGTTT GTAGTTCAAA TACTATTCCA 8040 

ATATCTTTTT TCTCAAGGTC GATTAAAAGA TGCAAAGAAT TATGTCTTAC ATCAACCACA 8100 

AGAAGTTCAA GATCATCGTG TAGTACGTAA TTTATTGGCA ATGTGTTATT TATATCTCGG 8160 

TGAATATGAT ACgGCTAAAG CATTGTACGA aGCACtATTA CAAGAGGATA GTACaGATAT 8220 

ATATGCATTA TGCCATTATA CTTTGCTACT TTATAACACT AAGGAAAATG AACAATATCA 8280 

AAAATATTTA AAAATATTAA ACAAAGTTGT ACCTATGAAT GACGATGAAA GTTTTAAATT 8340 

AGGXATTGTA TTAAGTrATT TAAAGCAGTA TCGTGCATCA CAACAATTGT TGTACCCTTT 8400 

ATATAAAAAA GGGAAATTTT TATCAATTCA AATGTACAAT GCTTTAGCAT ATAATTATTA 8460 

TTATTTAGGT GAAGAAGACG AAAGTCATTA CTACTGGGAT AAATTGAAGC AAATTTCTAA 8520 

AGTGGAAATT GGACATGCGC CTTGGGTAAT TGAAAATAGC AAAGAAGTTT TTGACCAACA 8580 

TATTTTGCCA TTACTTCAAA GTGATGACAG TCATTATCGT TTATATGGTA TTTTTTTATT 8640

GGATCAATTA AATGGTAAAG AAATTGTGAT GACGOAAAGT ATTTGGCAGG TTTTGGAAAA 8700 

TCTAAATAAT TATGAGAAAT TGTATTTAAC GTATTTAGTT CAAGGTTTAA CGCTCAATAA 8760 

ATTAGACTTC ATTCATCGCG GCTTATTAAC GCTTTACCAT AATGAATTAT TTGTAAGTGA 8820

AAATGATGTA ATGGTTGCAT GGATTAATCA AGGTGAACTC ATAATTGCTG AAAAAGTAGA 8880 
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ACTCGCATTG CCTGTAGAAT TTCTTTTCGA AATTCTCTGT 


GTTGGGGCCC CTGACTAGAG 10800

TTGAAAAAAG CTTGTTGCAA GCGCATTTTC ATTCAGTCAA CTACTAGCAA TATAATATTA 10860

TAGACCCTAG GACATTGATT TATGTCCCAA 6CTCCTTTTA AATGATGTAT ATTTTTAGAA 10920

ATTTAATCTA GACATAGTTG GAAATAAATA TAAAACATCG TTGCTTAATT TTGTCATAGA 10980

ACATTTAAAT TAACATCATG AAATTCGTTT TGGCGGTGAA AAAATAATGG ATAATAATGA 11040

AAAAGAAAAA AGTAAAAGTG AACTATTAGT TGTAACAGGT TTATCTGGCG CAGGTAAATC 11100

TTTGGTTATT CAATGTTTAG AAGACATGGG ATATTTTTGT GTAGATAATC TACCACCAGT 11160

GTTATTGCCT AAATTTGTAG AGTTGATGGA ACAAGGAAAT CCATCCTTAA GAAAAGTGGC 11220

AATTGCAATT GATTTAAGAG GTAAGGAACT ATTTAATTCA TTAGTTGCAG TAGTGGATAA 11280

AGTCAAAAGT GAAAGTGACG TCATCATTGA TGTTATGTTT TTAGAAGCAA GTACTGAAAA 11340

ATTAATTTCA AGATATAAGG AAACGCGTCG TGCACATCCT TTGATGGAAC AAGGTA7UUVG 11400

ATCGTTAATC AATGCAATTA ATGATGAGCG AGAGCATTTG TCTCAAATTA GAAGTATAGC 11460

TAATTTTGTT ATAGATACTA CAAAGTTATC ACCTAAAGAA TTAAAAGAAC GCATTCGTCG 11520

ATACTATGAA GATGAAGAGT TTGAAACTTT TACAATTAAT GTCACAAGTT TCGGTTTTAA 11580

ACATGGGATT CAGATGGATG CAGATTTAGT ATTTGATGTA CGATTTTTAC CAAATCCATA 11640

TTATGTAGTA QATTTAAGAC CTTTAACAGG ATTAGATAAA GACGTTTATA ATTATGTTAT 11700

GAAATGGAAA GAGACGGAGA TTTTCTTTGA AAAATTAACT GATTTGTTAG ATTTTATGAT 11760

ACCCGGGTAT AAAAAAGAAG GGAAATCTCA ATTAGTAATT GCCATCGGTT GTACGGGTGG 11820

ACAACATCGA TCTGTAGCAT TAGCAGAACG ACTAGGTAAT TATCTAAATG AAGTATTTGA 11880

ATATAATGTT TATGTGCATC ATAGGGACGC ACATATTGAA AGTGGCGAGA AAAAATGAGA 11940

CAAATAAAAG TTGTACTTAT CGGTGGTGGC ACTGGCTTAT CAGTTATGGC TAGGGGATTA 12000

AGAGAATTCC CAATTGATAT TACGGCGATT GTAACAGTTG CTGATAATGG TGGGAGTACA 12060

GGGAAAATCa GAGATGAAAT GGATATACCA GCACCAGGAG ACATCAGAAA TGTGATTGCA 12120

GCTTTAAGTG ATTCTGAGTC AGTTTTAAGC CAACTTTTTC AGTATCGCTT TGAAGAAAAT 12180

CAAATTAGCG GTCACTCATT AGGTAATTTA TTAATCGCAG GTATGACTAA TATTACGAAT 12240

GATTTCGGAC ATGCCATTAA AGCATTAAGT AAAATTTTAA ATATTAAAGG TAGAGTCATT 12300

CCATCTACAA ATACAAGTGT GCAATTAAAT GCTGTTATGG AAGATGGAGA AATTGTTTTT 12360

GGAGAAACAA ATATTCCTAA AAAACATAAA AAAATTGATC GTGTGTTTTT AGAACCTAAC 12420

GATGTGCAAC CAATGGAAGA AGCAATCGAT GCTTTAAGGG AAGCAGATTT AATCGTTCTT 12480
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GCGTTAATTC ATTCTGATGC GCCTAAGCTA TATGTTTCTA ATGTGATGAC GCAACCTGGG 12600 

GAAACA6ATG GTTATAGCX3T GAAAGATyAT ATCGATGCGA TTCATAGACA AGCTGGACAA 12660 

CCGTTTATTG ATTATGTCAT TTGTAGTACA CAAACTTTCA ATGCTCAAGT TTTGAAAAAA 12720 

TATGAAGAAA AACATTCTAA ACCAGTTGAA GTTAATAAGG CTGAACTTGA AAAAGAAAGC 12780 

ATAAATGTAA AAACATCTTC AAATTTAGTT GAAATTTCTG AAAATCATTT AGTAAGACAT 12840 

AATACTAAAG TGTTATCGAC AATGATTTAT GACATAGCTT TAGAATTAAT TAGTACTATT 12900 

CCTTTCX;TAC CAAGTGATAA ACGThAATAA TATAGAACGT AATCATATTA TGATATGATA 12960 

ATAGAGCTGT GAAAAAAATG AAnATAGACA GTGGTTCTAA GGTGAATCAT GTTTTAAATA 13020 

AGAAAGGAAT GACTGTACGA TGAGCTTTGC ATCAGAAATG AAAAATGAAT TAACTAGAAT 13080 

AGACGT 13086 

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1350 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

CATTAGTCAT GAAAATAGCC GACAACTTCA TCTGTGAAAT CACCGGCCTT TTATTTTAGC 60 

TAACTTTATT TCTGATTTTA CGATTTTAAT TGATCATACA GAGAAAGTGA TCTTTTTACA 120 

ATTTCTAAAA ACTCATGATC TATATTGGAC ATTTGATGAA AATAAGACAA AATGTTTTCT 180 

GTTAGCTTCT CTTGTTTTGG GAATGAATCA TCTTCTTTAA TCCAAATCGC TAATTCGCCT 240 

AATGGTGTTT TATCATCTTT AAATGTTTGT ATATATTCGT AAAAGCTCAT AGTATTCCTT 300 

CTCfCAATTT ACTTATATAA ATCCTACCAC GAAAGCTTTC AAGAAAACAC AATTAAATGT 360 

CTATTTAGTG AACTTTTTAA GOTTGTQCAC TCTTTTAATO TCTGCCAATT AGGTCAATTA 420 

ATCATCACAA TGTACAATTA ACTCTATTTT CAGTTCATAT ACTCACACAC CGTTTTTGAA 480 

CAACACATTA ACTTCTCATT TAGATAAAAC GCAAAAAAGC CTGGCACCAA TACAATAGAT 540

GCCAGACTAA GAGTCTACTA TATAAATTTA TTTAGCGTAT GGTTTTACTT CGATTGCACC 600 

TTCATTTTCA TCATGAACAC CATGCTTATA ATAATCAATA TATTGTGGCT CTAAAGGCTT 660 

TCTGCCACGT ATAATGTCTG CTGCTTTTTC AGCTAACATT AAAACAGGTG CGTGTATATT 720

GCCATTTGTC GTACGTGGCA TAGCTGATGC ATCAACTACA CGTAAATTTT CCATACCGTG 780 



ACTACAAGAT GGGTGTAATG CTGTTTCACC ATCTCTACGA ACCCAATCAA GAATTTCTTC 900 

GTCTGTTTGC ACTTCTGGTC CTGGTGAAAT TTCTCCACCA TTGAATGGAT CCATTGCTTT 960 

TTGAGATAAG ATATTTCTTG CTACACGAAT TGCTTCTACC CATTCTTTTT TATCTTCTTC 1020 

TGTTGATAAA TAATTAAAGC GGATACTTGG TTTTTCGAAT GGATCTTTAG ATTTGATTTT 1080

CAAGCTACCA CGAGAGTTTG AATACATTGG TCCTACGTGA ACTTGATAAC CATGTGCGAC 1140 

CGCTGCCTTT TGACCATCAT ATCTTACAGC TATTGGTAAG AAATGGAACA TTAAGTTAGG 1200 

ATAAtCAACT TCGTTATTTG AACGTACAAA TCCGCCACCT TCAAAATGGT TAGATGCTGC 1260 

TGCACCTGTA CGTGTGAAAA TCCATTGTAA ACCAATAAAT GGcATGCGCT TGAtATCTAA 1320 

GCTTGGCtGt AATGATACAG GTTCCTTACA 1350 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1376 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 



TAATGCTATT GGCAACACCA TATATGAAAn CTCCAAACGA TCCTAAACCG ACTATAGATT 60

CACCAAATTT nACAATCCAT GAATAAAGTA GTGGCCATAA GAATAACAAT ATGACAACTA 120

AAAATGTACA GTAAAATGCA GTCATAATTG GAACTAGACG TTTACCACTA AAAAATGATA 180

ATGCTAATGG TAATTCTGTT TCACTAAACT TATTGTATGC ATAAGCTGCT ATTAAACCTA 240

TTACAATACC AACAAAGACA TTGCCATTAT TCATCTTTTC AAAAGCTGAA TTTATTTCCG 300

ArGCTTTCAT TCCTAATAAA GGCGCTAATT TCATTGGTGA TAATACAACT GTAACTAAAA 360

AATATCCTAA CGTrGCTGCA rGCGsGACTG CACCATCATT TTTCTTTGCC ATTCCTATAG 420

CTACACGAAT TGCAAATAAA ATACCTAATT GCTCTAAAAT CGTAGTACCT ACCGTAGTAA 480

AGAACATTGC GATTTTCGGC GTCGCATGAA GTGCATTTAA CGTATTACCA ATTCCGGCAA 540

TAATTGCTGC A6CCGGTAAA ATGGCAACTG GTAACATTAA CGAACGCCCT AAATTTTGGA 600

AAAATTTATA CATTGAATGT CATCCTTCTT AAAATAATGT AGAAATATAA AGATTACTAA 660

TGTAACTAGA ATAACTACTT CGATACTCCG TTATAGTCAC CTAGGCTTAC TAACCAGCTA 720

TATTTCTACC TCAAGTTATT TTATAAACTT TTTACAATTT CATGCAATTC TTGTTGTAAC 780

TTTGCTGTTC GTGTTTCAAT CTCTTTTGTA ATATAATCGA TAOGCTCGTT TCGTTTTAAA 840

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AAAGACCGTG AATCTTAGTA GGACCAACAT AAGCAACAGG TAATATTGGT GACTTACTTA 960 

ACATTGCAAT TGTTGAAGCA CCaCGTTTCA AAGGTGCACC TTCTTGCGAT GTGCGAGAAC 1020 

CTGTTGGGAA GATACCAACT GTCTTATTAT CTTTCAACAA ATTGATTGGG CGTTTTAAAG 1080

TACTAGGTCC TCGATTTTCA CGATCTACAG GAAATOCATT TAAAGACGTT AAAAATTTAC 1140 

CAATCCATTT ATTTTTGAAT AATTCTTTTT TAGCCATATA ATGAATTTGA TTAGGATATA 1200 

ATGCCATACC TAGCATAATG ACTTCGTTAT AACTTTCATG CGTACAAGTT ACGACATATT 1260 

TACTATCCTT AGGAATATTA TCTTTACCGA TTACGTATAA TGATTTTGAC ATTTTAACTA 1320 

AAATGAAATT CAAAATCTTA CTAATCACTG AATACATTGT GCCACCTACT TAACTT 1376 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7363 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

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

TTGTCATACC AATATTTTGT AAAATATGGA ACACAAGTAA AGTGACGAAA CCAACGATAA 60

AGATTTTGTT AAATTGATCT TCAATTTTCG CAGCTAATCT TATTAGATGG AAGATTAAAA 120

ATAAAAATAT TAAGATCAAT ATGACAGAAC CGATAAAGCC AAGTTCCTCT CCAATCACTG 180

AAAAGATAAA GTCAGTATGA TTTTCAGGTA TATAAACTTC ACCGTGATTG TATCCTTTAC 240

CTAGTAACTG TCCAGAACCG ATAGCTTTAA GTGATTCAGT TAAATGaTAG CCATCACCAC 300

TACTATATGT ATAGGGGTCA AGCCATGAAT TGATTCGTCC CATTTGATAC AGTTGGaCAC 360

CTAATAAATT TTCAATTAAT GCGGGTGCAT ATAGaATACC TAAAATGACT GTCATTGCAC 420

CAACaATACC TGTAATAAAG ATAGGTGCTA AGATACGCCA TGTTATACCA CTTACTAACA 480

TCACACCTGC AATAATAGCA GCTAATACTA ATGTAGTTCC TAGGTCATTT TGCAGTAATA 540

TTAAAATACT TGGTACTAAC GAGACACCAA TAATTTTGAA AAATAATAAC AAATCACTTT 600

GGAATGATTT ATTGAATGTG AATTGATTAT GTCTAGAAAC GACACGCGCT AATGCTAAAA 660

TTAAAATAAT TTTCATGAAT TCAGATGGCT GAATACTGAT AGGGCCAAAC GTGTACCAAC 720

TTTTGGCACC ATTGATAATA GGTGTAATAG GTGACTCAGG AATAACGAGC AAGCCTATTA 780

ATAATAGACA GATTAAGAAA TACAATAAAT ATGTATAATG TTTAATCTTT TTAGGTGAAA 840

TAAACATGAT GATACCTGCA AAAATTGCAC CTAAAATGTA ATAAAAAATT TGTCT6ATAC 900

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TTGCTAAAAC AGCTATAGTG GCTACTAATA CCCAGTCTAC TTTGCGAAnC aATGCTTATC 1020 

CGGCTGTTGA CGAGATGAAT AATTCATTGC AAACTCCTTT TATACTCACT AATGTTTATA 1080 

TCAATTTTAC ATGACTTTTT AAAAATTAGC TAGAATATCA CAGTGATATC AGCTATAGAT 1140 

TTCAATTTGA ATTAGGAATA AAATAGAAGG GAATATTGTT CTGATTATAA ATGAATCAAC 1200 

ATAGATACAG ACACATAAGT CCTCGTTTTT AAAATGCAAA ATAGCATTAA AATGTGATAC 1260 

TATTAAGATT CAAAGATGCG AATAAATCAA TTAACAATAG GACyAAATCA ATATTAATTT 1320 

ATATTAAGGT AGCAAACCCT GATATATCAT TGGAGGAAAA CGAAATGACA AAAGAAAATA 1380 

TTTGTATCGT TTTTGGAGGG AAAAGTGCAG AACACGAAGT ATCGATTCTG ACAGCACAAA 1440 

ATGTATTAAA TGCAATAGAT AAAGACAAAT ATCATGTTGA TATCATTTAT ATTACCAATG 1500 

ATGGTGATTG GAGAAAGCAA AATAATATTA CAGCTGAAAT TAAATCTACT GATGAGCTTC 1560

ATTTAGAAAA TGGAGAGGCG CTTGAGATTT CACAGCTATT GAAAGAAAGT AGTTCAGGAC 1620

AACCATACGA TGCAGTATTC CCATTATTAC ATGGTCCTAA TGGTGAAGAT GGCACGATTC 1680 

AAGGGCTTTT TGAAGTTTrG GATGTACCAT ATGTAGGAAA TGGTGTATTG TCAGCTGCAA 1740 

GTTCtATGGA CAAACTTGTA ATGAAACAAT TATTTGAACA TCGAGGGTTA CCACAGTTAC 1800

CTTATATTAG TTTCTTACGT TCTGAATATG AAAAATATGA ACATAACATT TTAAAATTAG 1860 

TAAATGATAA ATTAAATTAC CCAGTCTTTG TTAAACCTGC TAACTTAGGG TCAAGTGTAG 1920 

GTATCAGTAA ATGTAATAAT GAAGCGGAAC TTAAAGAAGG TATTAAAGAA GCATTCCAAT 1980 

TTGACCGTAA GCTTGTTATA GAACAAGGCG TTAACGCACG TGAAATTGAA GTAGCAGTTT 2040 

TAGGAAATGA CTATCCTGAA GCGACATGGC CAGGTGAAGT CGTAAAAGAT GTCGCGTTTT 2100 

ACQATTACAA ATCAAAATAT AAAGATGGTA AGGTTCAATT ACAAATTCCA GCTGACTTAG 2160 

ACGAAGATGT TCAATTAACG CTTAGAAATA TGGCATTAGA GGCATTCAAA GCGACAGATT 2220 

GTTCTGGTTT AGTCCGTGCT GATTTCTTTG TAACAGAAGA CAACCAAATA TATATTAATG 2280 

AAACAAATGC AATGCCTGGA TTTACGGCTT TCAGTATGTA TCCAAAGTTA TGGGAAAATA 2340 

TGGGCTTATC TTATCCAGAA TTGATTACAA AACTTATCGA GCTTGCTAAA GAACGTCACC 2400 

AGGATAAACA GAAAAATAAA TACAAAATTG ACTAACTGAG GTTGTTATTA TGATTAATGT 2460 

TACATTAAAG CAAATTCAAT CATGGATTCC TTGTGAAATT GAAGATCAAT TTTTAAATCA 2520 

AGAGATAAAT GGAGTCACAA TTGATTCACG A6CAATTTCT AAAAATATGT TATTTATACC 2580 

ATTTAAAGGT GAAAATGTTG ACGGTCATCG CTTTGTCTCT AAAGCATTAC AAGATGGTGC 2640

TGGGGCTGCT TTTTATCAAA GAGGGACACC TATAGATGAA AATGTAAGCG GGCCTATTAT 2700

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

AAACCCTAAA GTAATTGCCG TCACAGGGTC TAATGGTAAA ACAACGACTA AAGATATGAT 2820

TGAAAGTGTA TTGCATACCG AATTTAAAGT TAAGAAAACG CAAGGTAATT ACAATAATGA 2880

AATTGGTTTA CCTTTAACTA TTTTGGAATT AGATAATGAT ACTGAAATAT CAATATTGGA 2940

GATGGGGATG TCAGGTTTCC ATGAAATTGA ATTTCTGTCA AACCTCGCTC AACCAGATAT 3000

TGCAGTTATA ACTAATATTG GTGAGTCACA TATGCAAGAT TTAGGTTCGC GCGAGGGGAT 3060

TGCTAAAGCT AAATCTGAAA TTACAATAGG TCTAAAAGAT AATGGTACGT TTATATATGA 3120

TGGCGATGAA CCATTATTGA AACCACATGT TAAAGAAGTT GAAAATGCAA AATGTATTAG 3180

TATTGGTGTT GCTACTGATA ATGCATTAGT TTGTTCTGTT GATGATAGAG ATACTACAGG 3240

TATTTCATTT ACGATTAATA ATAAAGAACA TTACGATCTG CCAATATTAG GAAAGCATAA 3300

TATGAAAAAT GCGAC6ATTG CCATTGCGGT TGGTCATGAA TTAGGTTTGA CATATAACAC 3360

AATCTATCAA AATTTAAAAA ATGTCAGCTT AACTGGTATG CGTATGGAAC AACATACATT 3420

AGAAAATGAT ATTACTGTGA TAAATGATGC CTATAATGCA AGTCCTACAA GTATGAGAGC 3480

AGCTATTGAT ACACTGAGTA CTTTGACAGG GCGTCGCATT CTAATTTTAQ QAGATGTTTT 3540

AGAATTAGGT GAAAATAGCA AAGAAATGCA TATCGGTGTA GGTAATTATT TAGAAGAAAA 3600

GCATATAGAT GTGTTGTATA CGTTTGGTAA TGAAGCGAAG TATATTTATG ATTCGGGCCA 3660

GCAACATGTC GAAAAAGCAC AACACTTCAA TTCTAAAGAC GATATGATAG AAGTTTTAAT 3720

AAACGATTTA AAAGCGCATG ACCGTGTATT AGTTAAAGGA TCACGTGGTA TGAAATTAGA 3780

AGAAGTGGTA AATGCTTTAA TTTCATAGAG ATTAGTCGAG GGACCTTTTA CTTATAAAAA 3840

TGATTTGAAT TAATACTAAA AGATTACAAA GAAGAGGTGG TTTTGTGTGT AAATACAAAA 3900

TTGCCTTTTT CTTTTTATGT TAAATCTATA AATTTGAAAC TAAATCAAGG TTAATTCTAT 3960

GTACACACTT TATATAGGAA GTAGTTTGAA TGTTTATATA ATGTTTTACA AAAAGATGTA 4020

GTATTATAAT GTCTAATTTC ACATGTGTTT CAGTAAAATT TGTTGTGGAA TGTTAACGAT 4080

ATACGTATTT TATAAAAaAT TTTTTATAAT GATTATTCGA ATGATGCGTA ACGCTTACAT 4140

CTTATCTAAT GCTAGCTTTT TGACAAAAAT ATGACAATCA ATTAATGTGA TTCTAATAAA 4200

TATTCGCAAA TTGCTTTATT GCGATTAAAT TTTTTTGGTG GTACTATATA GAAGTTGATG 4260

AAATATTAAT GAACTTATAT GCAAAAGTAT ATTGAGAAAT AAACAGGTAA AAAGGAGAAT 4320

TATTTTGCAA AATTTTAAAG AACTAGGGAT TTCGGATAAT ACGGTTCAGT CACTTGAATC 4380

AATGGGATTT AAAGAGCCGA CACCTATCCA AAAAGACAGT ATCCCTTATG CGTTACAAGG 4440

AATTGATATC CTTGGGCAAG CTCAAACCGG TACAGGTAAA ACAGGAGCAT TCGGTATTCC 4500

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

AGAATTGGCA ATGCAGGTAG CTGAACAATT AAGAGAATTT AGCCGTGGAC AAGGTGTCCA 4620

AGTTGTTACT GTATTCGGTG GTATGCCTAT CGAACGCCAA ATTAAAGCCT TGAAAAAAGG 4680

CCCACAAATC GTAGTCGGAA CACCTGGGCG TGTTATCGAC CATTTAAATC GTCGCACATT 4740

AAAAACGGAC GGAATTCATA CTTTGATTTT AGATGAAGCT GATGAAATGA TGAATATGGG 4800

ATTCATCGAT GATATGAGAT TTATTATGGA TAAAATTCCA GCAGTACAAC GTCAAACAAT 4860

GTTGTTCTCA GCTACAATGC CTAAAGCAAT CCAAGCTTTA GTACAACAAT TTATGAAATC 4920

ACCAAAAATC ATTAAGACAA TGAATAATGA AATGTCTGAT CCACAAATCG AAGAATTCTA 4980

TACAATTGTT AAAGAATTAG AGAAATTTGA TACATTTACA AATTTCCTAG ATGTTCATCA 5040

ACCTGAATTA GCAATCGTAT TCGGACWTAC AAAACGTCGT GTTGATGAAT TAACAAGTGC 5100

TTTGATTTCT AAAGGATATA AAGCTGAAGG TTTACATGGT GATATTACAC AAGCGAAACg 5160

TTtAGAAGTA TTanAGAAAT TTAAAAATGA CCAAATTAAT ATTTTAGTCG CTACTGATGT 5220

AGCAGCaAGA GGACTAGATA TTTCTGGTGT GAGTCATGTT TATAACTTTG ATATACCTCA 5280

AGATACTGAA AGCTATACAC ACCGTATTGG TCGTACGGGT CGTGCTGGTA AAGAAGGTAT 5340

CGCTGTAACG TTTGTTAATC CAATCGAAAT GGATTATATC AGACAAATTG AAGATGCAAA 5400

CG6TAGAAAA ATGAGTGCAy TcGTCCACCA CATCGTAAAG AAGTACTTCA AGCACGTGAA 5460

GATGACATCA AAGAAAAAGT TGAAAACTGG ATGTCTAAAG AGTCAGAATC ACGCTTGAAA 5520

CGCATTTCTA CAGAGTTGTT AAATGAATAT AACGATGITG ATTTAGTTGC TGCACTTTTA 5580

CAAGAGTTAG TAGAAGCAAA CGATGAAGTT GAAGTTCAAT TAACTTTTGA AAAACCATTA 5640

TCTCGCAAAG GCCGTAACGG TAAACCAAGT GGTTCTCGTA ACAGAAATAG TAAGCGTGGT 5700

AATCCTAAAT TTGACAGTAA GAGTAAACGT TCAAAAGGAT ACTCAAGTAA GAAGAAAAGT 5760

ACAAAAAAAT TCGACCG7AA AGAGAAGAGC AGCGGTGGAA GCAGACCTAT GAAAGGTCGC 5820

ACATTTGCTG ACCATCAAAA ATAATTTATA GATTAAGAGC TTAAAGATGT AATGTCTTGA 5880

GCTCTTTTTT GTTTTCAATA ATTGATTCTC TGTAGATATC aAAGTaCTAA CGTTTTAAAG 5940

GTTAAATATT TAATTGGATT GAGATCTGTA TGCGGTTATA TCaTTCTGTG TAAATATGGT 6000

TCTCCACCAA ATGTGGTGAG TATATAATTT AAAGAACTAT TTTTAAATTA AGAATAATCG 6060

AACATAAATA AACTTTATGA AATTTCAGTA TCATGTTCTT ATAAAAAACA ATAGGGCTTT 6120

TTGctGACGC TAGTGCGCGA TAAATAATAA GTTGAATATA AAAAAGATCA CTGCCAATCA 6180

TTCGTTTAAT GGCAGCGATC TTTTTTATTT AATTATTTCT CTTTCCACTG CAACATTTGA 6240

TAACCAATGC GTGGATGTGT TTTAATAATA TCTTTTGCGT CCTCATGACA TTGTGAAAGT 6300

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CCATATATTC GTTTTAATAT CATCTCATAA GTGAGTACTT TTCCTTTATG ATTTGACAAT 6420 

AGTTCTAACA AGCTAAATTC ATTTGGCGTC AAATGTACCT CCTGATTATT AATAACAACA 6480 

GATTTGGAGC CAAAGTCGAT GCTTAGCAAA CCGTTAGTAA ATACAATGTT AGTTTCTTGA 6540 

TGTGACTTAG CGATTCTCTC GATGACTCGT ATTCGTGCCC GAAGCTCATC AACATTAAAA 6600 

GGTTTAGTCA TATAGTCATT CGCACCGTTA TCTAAAGCTT GAATAATTGT TTGTTCTTCT 6660 

TGTCTTGCAC TTATTACAAT GATAGGAATG TCAGTATGTT GCCTGATTTC TGAAATCAAA 6720 

CATAATCCAT CTTTATCTGG TAAACCTAAA TCTAATAAAA TGACATCTGG TTTATCAATT 6780 

TGAATTTTAA AGTGTGCTTG TGTGGCATTG TCGGCTGTAG TTACATTGTA ATAATCTAAA 6840

GTTAATGCAA CATCAAGTAA ATGTGTGATT GCGTGATCAT CTTCAATTAT CAATATTTTA 6900 

GATT6CATTA TACX3TCTCCT TCGTTAAAGT CTGTATATAT ATTGAAATAG AATATACTGC 6960 

CGTGTGGTTG GTTCGGTTTA TATTGTAAGT TTGATTGATG TTTGTGTAGG ATAGTCTGTA 7020

CTAAATATAA GCCTAGTCCC ATGCTTTCTT TTTGGTTATC TTTAAAATAT TTATTTGATC 7080 

CTGTGTAAAA AGGCTCGAAT ATCTTTTGTt GTTCTTCTAA ACTAATTCCA GGTCCTTCGT 7140 

CTATAACGGC AAATTCGATT TGTTCATAGC TAGCATAACG AATAGATAAA TTGATTTTGG 7200 

TGTCAGTAGA AGTGTGTTTA ACTGCATTTT CAATCAAATT GAAtAAAgCT TGTAAAATCA 7260 

ACTTACTGTC AATGTGTATA AACtGTAAAT TTACTGAGGA TGATACAGTT ATACGCTTTT 7320 

TTAAATGGCG ACGTTCTAAA ATACATATCG ATTTCTTATA CTA 7363

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10470 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 



TTAACAATCG ATAACCACAA TACTTCTATT OTAATTGTTT AACGATTTCn CGATTAAAAT 60 

CATCTAAATC GTCTGGTACT CGACTTGTTA CAATATTGTT GTCTACAcTa CTGACTCATC 120

AACTACATGT GCGCCTGCAT TTGATAAATC TTTGCGTACA TTTAATACTG CTGTTAACGT 180 

ACGACCTTTT AAATCGTCTG TATCTATTAG TATTTGTGGC CCATGACAAA TGGCAAATGT 240 

TGGTACATCA TTTTTAGTAA AGTATTTAGC AAATGTGCCA TATCGACCTT CTGTATCTCC 300

ACGTAAATGA TCTGGTGAAA ATCCTCCAGG AATTAATAAT GCATCATAAT CTTCTGGTTT 360 
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10 





ATTTGCAGTA TCTCCAATCA CTACAGTATT AAAGCCTGCA TTCTCTAATG CCTCTTTAGG 480 

GCTTGAATAT TCTATATCTT CAAATTCGTT TGCTAGAATA ATTGCTACTT TTTTAGTCAT 540 

TGAAAATCAC CTTTCTATAT ATCATTGATA TAATTACTAT AGACAAGTAA ATCAGTGATT 600 

AAACATACAA GATATAAAAA ATATTAAGCX? ACTGTCGCGA TATCTAACCC TAACACATCT 660 

TATGTGGCAT TTACTTAGAT ACTAATTTAA CCTTTTCTTC AAGCTGATCT AACAATCCAA 720 

TCCATTCATC TATATCTTCA ACACGTACTT CATCAGGATT TACATGATCG ATATCCTCAA 780 

TAAACTTATT TAAACGCGCT TTTATCTGTT CGATTGTTTG CTGTTCATTC ATAAAAAGTT 840 

AACTCCTTTT ATTTTGTTTT CTTTTTCATT ATTATCCTAA CAGAAATTGC GTTAAAGCGA 900 

TATAATCTTA GCTATATTTA TGACATTCAA ATTATTTTGA CTTTTAAAAA TCCCCTTTTC 960 

AATTAACTAA AATTAAGAGA TAATTTGTTA CGAGTGATAA TACGAaGkGG TaTCATACCG 1020 

ATATGAACCA AATAGAAAGA AGGAAGTTTA AGACGATGAA TAGCGTCAAA TTGAA6CAAC 1080

CTGTTAGCAT TTACAATGAT CCATGGGAAG TGAAATTTAT ATACATTTAA ATTTCAT6AG 1140 

ACAATAAACG TTGATTTAAT GCGTTTTTTT GCCTTTTTTA TTTTCCTTAT TTTTTCTQTT 1200 

TTACAACAAA ATGGTATCAA AAATGGTATC ATTTGTAGTT ATTTTAGCTT CACATATTAA 1260

AACAACCACA CTCCTAAATT AATAG6TGGT GTGGTTTTGT TGGTTGTGTG GGGATAAAAA 1320 

TAACCGCATC AGTTAAGATG CGGTTATCTA GCAAGGGCCA CGTATTTATA AATACOTTTA 1380 

GAATCTCTTC GGCAACTTTG CTATA6ACAG TCTATGCTGT TACTAAATTA TACCACCACA 1440 

CAAACCTACT CCCATTCAGG AACACAGAGC TTTGTCGCTC GTCAGCAACG TCATATGAAT 1500 

TCTCAGTTCA TGTTGTGGTG ACACTTTAAA CGGTCTGTGC CAGTAGCGAC CQAGTCATTT 1560 

CAAGAATGAC CATTTCACAT TTATATTATA ACACTTGTCG TGCGTAACTG TATAGTTTTT 1620 

CAGftGTATT TAAAGTTAAG TTATCTACTT CGCGCTTTCC TTGCCTTAAT TGTGAAATTA 1680 

CATATTGCGC TACGCCAGTT TGTTT6TGAA TTTGGTAACC TGTTATATCA CTTTTGATCA 1740 

ATTCAATTAT TTTTAATTTA TAATCACTCA TATTATCTAC GTCCATTCTT TTTATCTAAA 1800 

CAATAAAAAT GTGTCTTTCT CCCGATAAAT AATAACAATG GTAGGCTTAA TAAAAACAAT 1860 

ATTAAATACA TTTGTTCTGT CATAATTGAA AACCTCCAAA TAATATTATA TTATATAAGT 1920

GTAAGGAGGA GCCATCAGGC TCCAA6CATA ATGTTAATCT TTGTTGTTTG GCTTTCGGTC 1980 

TAGGTAGCCG AGATGCCaTT CTCTAAGTTG TTTTAACACT TCTGG7VATTA TCAOTACTGC 2040 

CAATACTTGA TGTTCTAGAA GTGTTTTTAT TATGTCTAGC ATGAGGCTTT TCACCTCCTT 2100

ACACATAATT TGTAAGTCAT CAACTAACCT ACAAATATAA TTATACTAAA CAAATGTTTA 2160 
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3S 



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GrrrATCTACA TTTAAATCTT GAGAGAAATG TTAAAAAGTT CTAGTAAAAT AATAGCACAT 2280 

TTTATCTTTA AATGTAAATA GAAAGCAGGT ATGTAACGCA CCTGCTTAAA TAGaCATGAC 2340 

TATGTCATTC TAACTGATTT CTCCCCATAA GTCACCTAAT ATCTGATTAG GTGGGGCAGA 2400 

ACCATTCCAT GTTCTAATAG GCAAGTAATA ACGTT6CCCC TCCCATGTAT ATCCTACCCA 2460 

AACATGACCA TCTTGTAACA TCACTTCTGT ATAATCACAA TACCCACCAG GTTGGAACTG 2520 

ATAACCCACT GGACAAGATA AGAATGGCCC CACTTTTCTT ACTGTGATTG GTTGATTGCC 2580 

GTTTGTGAAT CTAGCACTTT CTTCCATGTA GTAAGTACCA TATTTATTAC GTTTCCATGC 2640 

ACTTGCAACT GGTTTAACTG TATTACTTGA AGCGCTTGAC TCATTAGAGA CAGTGGCAAC 2700 

OGGTATTTTA CCATCCATGT ACGCCCTAAT CTGCTTGATA AAGTAGTCTT TAAGTTGCAA 2760 

CCGCTTGTCT TCTGGCAATA GACCGCGA6T TACTGGGTCA AAACCAGTGT GTAAAACCGA 2820 

ACTTCTATGA GGGCATGATG TTGAAGTAAA TTCATTGTGC AATCTGATTG TATTTCTGTT 2880

TGCTGGTAAT CCCCATTTTT TCAACAATCT AGCX3CATTCT TGGAAAGTTG CCT6TTCATT 2940 

TTTTAAGAAT GTCGCGTTAT CTGCGCCCAT TGATTGACAT ACTTCAATAC CGTAATAATA 3000 

TTTATTACCT ATTTGATTAG CGGTATGCCA ACCTACTTGT GATTCATCTA AGGCTTGCCA 3060

AACTGTGTTG CCTGATACGT AACTATGCX3C AATGCCCGCT TCTAATCTTG ATAAAGGTGC 3120 

ATTTACTAAT CCGTTACGAT ATGCTTCAGC AGTCGCCCCT TTGCTCCCTG CGTCGTTGTG 3180 

TATAACTATA CCTTTAGGGT TACTACCACG CTTAGGTAGG TCATAACCTT TAACCACATC 3240 

TTTGATGATT TTAAGTTCTA CTGCTTTAGG TTGTGGCTTA GCTGTTTCTT TTTTAGGTGC 3300 

TTGTGTAGGA GATTGAACTG ATCGTGGCGC TGTCTCACTT TTAAAATTCG GACGGATAAA 3360 

CCACATAGGG AAATCATAAG CATGTTGTCG TCTTGTAACT TTTTCCCAAC CCCAGCCGGG 3420 

TTGTTCGATT CCGTCAGTCC AGCCACCGCC TAGCCAATTC TGCTCATATA CAATGATGTA 3480 

ATCTAAAGTT GCTTCAATTA CCCATGCAAC GTGACCATAT CCAGCACCGT AGTTGCTACC 3540 

GAATACCACC ATGTCGCCAG GTTGTGCTAA GAAGTCCGGT GTATTTTGGT ATACAGTAGC 3600 

TAATCCGTC6 AAGTTGTTAG CGAACGGAAT ATCTTTTGCA CCTAAACCTT TTAGAAGTAA 3660 

TCCAAACAAA ACTTTCCAAC CAGCATTGGC ATAATCAAAG CATTGAAATC CATACCATAA 3720

GTCCACATTG AATTGTTTTC CCTCAGAAGT TTTCAACCAC TCTATAAACT CATTTTTAGT 3780 

TAATTTTGCT TGCATTGTCX3 CCACCTCCAT GATGATACTC ATTCACATCA AAGCCAACAT 3840 

CGTTAGAGGC GTCTGTGAAA GGTTGTGATG TATCATATTC TTTTGGTGcT TTCGCGCTTA 3900

ATTCCGGCGT TAAACTACTG TCTTGTGATG ATTTCCACGT AACTTGTTGT TCTTCTTTTT 3960 
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TTGGGTCAGT AATAACGCCA ATACCTGTAA GTAACGTGAG GATAGCGCCT ATAATTGCGC 4080 

TAGCTTGATT TAATTGAGTA GATAAATCTA ATCCGAATAA ATCCGTGACT TGCTT6ATAA 4140 

ATAGCAACAA TGCTCCAACT AAACCAGTTA GTACTGCTTT GTTTTTGAAT CTCAATTTCC 4200

AGTTAATATC CATTTGTTTG CTCCTTTTAT CCAAAATAAA AAAACGACTA AAAATTAGTC 4260 

GTTTAAAATT ATTCAATGGT CAATGTCGGA GATCCTGAAT AAACATCACT TATAGTGACG 4320 

TACAACATCC CTGAAGGATT ACTAAAGTTG ATATTTTTAC TTGCAACTCC GCTATTGACT 4380 

CCTGATATTC CTAAATCACT TGACCCTAAA TTAGTTTGCG AAATCCTCAT TATACCGCTA 4440 

CGTACATTTT CTATTGTCAC CTGATAACTT TTATTGGGTT CAACTCCATT TATTGTCCAT 4500 

TTTGCTGTTG ATTCTTCTAT GCTATCCGGA TATTTATTTT TAGGTAAGGG TTTTATTACA 4560 

AAAGATGAAG GCTTTTTCCA TACTTGGATA TTTCCAGCAT ATACTTTTGT ATATTCTTCA 4620 

CCTTCGTAAA TAAACTTCTT TACATTTTTA AAATTACCTT CCATAAAAAT CACCCTTTAA 4680

TTAAATATAA CGTATTCGGG TCTTTTTGAT ATATATAGTT ATATTCATTT TCT6TTCCTG 4740 

TCCAAATTTT AACCGTCGGT TGAGATGCGC TTTTTAGTTG ATATAAATTA TCCGCTTGTT 4800 

GTTTAGTAAA AGCTTGAGAT GACAAAACAT ACCGCTCGTC ATGATTATGA TTTTTTGGAG 4860

CATATAAATC ATTTAGTGTT TGTTTGAATT CCTCAAAATC TTCTGTATTA ACTTTTGAGC 4920 

CAATCTGTTG CAATACACTT TCTGAAATAG AGTTGTTTTG TATTGCTTCT GCTAATTCTC 4980 

TTAATGTGTT CATAGATTCA GGCGCGCTAT CAACTAGTTC AGCAATTTTT GTATCOGTAT 5040

ACGTTTTAGA GTCGTTGAGA GTTGTATCTT TGATTTTTTC AACTTCTTGC AATTTATTTT 5100 

CTAACCCTTC AACATTTGCG ATATTCATTT TGTCCAATAA CTCAGGTTCT GCTTTGATAT 5160 

CTGTATCTTT ACCATCAATT TGCCACATTT TAGTGTCAGG ATTGATTGAT ACTACAGTAC 5220

CGTTTTTACC GGGTGCGCCT TGTTCTCCTT TTTTACCTGC TTCACCTTTT GCTCCAGGTT 5280 

GTCCCGGTTC ACCTTTATCA CCTTTCGCAC CTTTAAATCT ACTTTCATTC TTTTCGATGT 5340 

AAGAAATGAC ATCTTTATCT ATTTTCTCTT TAAAGTCTTT GCTCAATAAA TCTGTCGCGT 5400 

TATCTTTTAA AATTCTCGTA ATAGCATCAT CTACCAATTT AACATCGATT TCTTTTGCTA 5460 

CAGCAGATTC AATACCACTA TCAACGATAT TGAAAGAAAA GTTTGCGACA TGTATTTTTT 5520 

CTTCTTCTTT CTCTAAAAAC AGCTTACAGC OAACATAACC AGCGTGTTTG ATAACCTTTT 5580 

TAGGTATCTT GTAGGTAAGG AAACCTTTTA CAACATCGTC GATAATAAGG GGCTCATTTT 5640 

TGAATATAGA GCCATCTTCC ATAAACAAAT GTAATCTAGG TGTTAAGCCA TGTGCTTTTA 5700

GATCGATACG ACCTTGTTTG TCATTGATAC CTATTCTTAT AGATGCTGTA TTTTCATCTT 5760 

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CAACATCTTT TATTTTGTAC ATTTACACAC CTCTTTATTT ATATTTATCC CTTGTGAAGT 5880 

AGATACCTTT TAAGCCGATT TGTTTATATA ACTTAGCGAT TGTACTTGCT TGATGTTGGC 5940 

ACCACTCTAT AGCAGTAGCG TATTGGTGGG TAGCTGGATT CTTAGGATTC CATCTAATTC 6000 

GGTACAATGT GTTTTGACCT TTATTGATGT AATCCTTTCT TACGAAGCTA GCACCGCCCA 6060 

TGATTGCTTT TGCTGGAGAT GTCCAACCTT TATTCCTTGC AAACGTCATT GCGTAGTTAG 6120 

GATTGTTGTC GTAAGCGCCA ATGCCGAAGT AGTTGTATAC TCCATCTTTT CCGTTAGCGA 6180 

AGTTACTTGT TCCATATCCA CTTTCTAAGA AAGCATGCGC GATTAAATAA ATTTCATTAA 6240 

TGTTGTGCrr TTTACAAGCT TCTGCGAACG CTTTACCTTG ATTATTCAAT GTTCCCTTAC 6300 

CTTTAAGTAT CTTATTAAGT GCGCTAACTG AAACACCTTG ATACTTGCCT AAATTAAGCA 6360 

TTTGGTAGCA TTGTGTGTTA CTTTCCCATA TACGCTTTAC ATTCATTGCT GAACTCGTTT 6420 

GTGCTCGTGT AGCGTTAscC AACCCCAAGC ATTAGATTTT TTCGGGTTAC CTCTTGCCAT 6480

TTGTTTATCX: AGTGCTTGTT TGAATGTATA AGGACTCGTT TCTGTTATGA TCTGCGGTTG 6540 

TTTAGATGCC 6AACCATTGT TGGCTGTTGG TGACGAGTCT CTTACATTAG CTATATCAGC 6600 

GTTTTTATTA TCTACCATAA CTTTTATTCT AGATTTTGTT ACTGTTGGCT TAGTTATAGA 6660

ATTTAATAAT TTTTCTCTGT TTTTAAATAT ATTAAGTAAT GCCTTTTCTA ATGCTTCGTA 6720 

TTTATCTTTA GGAGGAACAC CGTTGTCAAT CATATTCCAA TTAACATGTT CCAACATTGA 6780 

ACGCCAAATG CTGTCGTCTA CTTTTAAATT TTCAATACTT AGAGGTATCT CATATTTGGC 6840 

CATCATATCT ACAGCTACAA CCATTGCGTG AATCTCATTA AAAATAAATT CATTTTTACT 6900 

CGCACTATAA TCTTCACATA CGTCTATAAC TATATAATCA GGTTCATTAG GAACTTCAAA 6960 

TACAGCTCTT CTAGGTGCCC AAATATTATG TCTATCAACA TAAAAGTGGG GATATTCTAC 7020 

ATCCTGTTTG TATTTCTTCC TACTGTTATA TAAACTTTCT ACCGAGCTCA TCGTTTGTGC 7080 

GTTTCTAATC ATTATTCCTT TAGGTTTTTC GAGTCGTCGA TTACCTTCTA CTATAAAGTG 7140 

ATAAATATAT TCTGGATAAT TAACCTCTTG GCTAGAAATA GTGTACTTTA TAGTTGTTAC 7200 

ATCTTTCCAA ATTGGAACTT TTTTATTATT TTTTTCGTTA TCATCACTAT CATCTTCTGG 7260 

TTTAGGTGCC GGTGTAGTTT TGTCTGGATQ ATATGGTGGT CTAACAAAAT ATTTAACCCC 7320

TCCACCTGGT CCATCATGAT AAGAGTGTTT AATTTTATAA GGTGGACTTC CTGTTGCGTT 7380 

ATTTGTATAC CAGTTTTGAT CTACGCCATA CCAATAGTCT TTTGTGCATG GTCCCACTAC 7440 

AATGTTTACA TGTCCTGCCC AACCACCAGT CCAAACACCC CAGTCGCCTG GTTGTGGTAC 7500

AAAATCTTTT GTATTTCTAA TTATCTTGAA ATCTCTACCT CTATAATTGG ATTTTTGAGC 7560

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

TAAATCCCAG CATTGTGCTC CCATTCCAGA ACCAGGTACA TCAATAGCTA TTTTGTTTTT 7680

AGCGATATAT AACGCCCATT CAACCACTTC ACTAGCTGTG GGCTTTCTAT TTTTCGGATT 7740

AGGTAATCCC ATGTATGCAC CTCATTTCAA TCAAAATAAA AAGCCAGTGC CGAAGCACTG 7800

ACTCTTAACT GTTATTTACA TTTACCAAAC CAGAAGCACG CCCAGAAGCT ATATCCTAAA 7860

ATCCCTTTAA GCATGGTAAT CACCTCCTTT AAATACCAAA AACAGTTCTT AGTAAAGCTA 7920

TGACAATCGT ACTGAAGATA GTCCCTATCA AACCTAGAAT CCACATTTTT ATGTCTCTAA 7980

TATTCTTGGC ATTCTTTTCT TTATTCTTTT CATCTTCTAC CTTGTCGCGC TTTAATTCTT 8040

CAAAATTTCT ATCTAATTTG TCATAAATCT TTTCTTGCGC TCTAAGACTA TCTTCTATTC 8100

TGTCGAATTT TTCAAACATA GTCTTATCAT TTTCTTCTAA TCGCGTTAAA CGCCAATCTT 8160

GTTCATGTCG TTTGGTAAAT CCAAACATTA TGCCACCCAC TTTATTCAAA TTAAAAAGCC 8220

ACAAGCATTA CACCTGTGAC TTTTCATCTT TTGTTTCTGG ATATTTTTCT CCAGTGATTA 8280

AAGCGTATTC TTCTTTATCG ATTAAACCCT TGTCTACGTA CCACTTAATT TGCTCGTTTT 8340

TATAGTAACC CCAAACATAA AAAGTTTTAA TGTCTTTAAA AGTTGGATAA ATCATCTTCA 8400

TTATTTAAAC GTCCCCCTCA GTACTTGTTT TGTTAGTTTT CAGTTCAGTC AACTGTTGTG 8460

TTAACATAGC GTTTTGTTGA GCTAATTCCA TTGTTAATAC GTTTACTTGT GCCACCTGCA 8520

TTTGCATACT CGCAACCATT CCGCGAAGTT CCTCATCACT TAAATCTGAC GCACTTTGTT 8580

GGTTTGATGC ATTCGGTACG TCTTCTTTTT CGAAATTGCT ATTGTATTTA ATTTCGCCGT 8640

TAGTGAAAAC AAACTTTCTA GGTTCGAACT CTTCTTTAAA TTTAATAGGC ACATTGTTAT 8700

CATCTACATC TAAACTATTG CGTAAACCGC CAGTATTAAC GAATCCGATA ACTTCGTTTT 8760

TATCGTTTAC TGTGATTTTC ATTATTTCCA CCCCATAATT TTAGTTATAG TAACTTTGTT 8820

GGCAJTCGCT CCAGAACCTG ATGTTTTACC TAAATCAAAG TACACATCGT TATCTATTCT 8880

TAAfiGTAGTG CTACTTGTTT TGGATAGTAA GCACTCATAA ATACCGCCAC CGTTGCCGTC 8940

TGAGTCAACT ACATTCGCTT TACTCAATTG AATCGCGTTA GGTAATGCGG TTAGTCCGAA 9000

TCCCTCAATA ACGCCACCTG GATAAGTTCC ACTTACCAAC AAAATAGAAT AGTTTGTGTA 9060

CGGTTCAGTT AGATTGATTG TTGTACCTAC ACCATTTGCG CCACCGTCGA ACAATACCGT 9120

TGATTTATGT TCATTAGGAA CTGTCCACTG TTGCTCAAGT CTGCCGTTTG TGATTGATCG 9180

TGTGTAAATC TTTTTAGAGT TATAAGGTGT GAAGTTAAAT AGCTTGTTTG TATCATCTTT 9240

AACGAATACC GATAAATAAC CCTCATAACT TTCAACGCTA CCTGGTAAAT CCGGCACTCT 9300

TGTTGCATAG TAATTACCAG CAGTTAAATA TCCCAAATCG CCTTGCGCAT TATTTAAGTT 9360

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

TCTACATACT GCTTAGCTT6 ATTTAAAGCG TTGTTAGACG TTTCTTCAAC 9480

AAATTGCTTA GTTAAGTTTC CATCATTCTT TTTATAAAAC GGGTACCATG TGCCGTAGAT 9540

TTTGTATTTT GTGTACTCAT CGTTTGAATC GTCTGGGTAC CATGTTGCAC GAGCAGTATT 9600

ATTATCAACA ACATAAACA?^ CTAACACACC AGATTTGCTT GATGTATAAG TTGATTCATC 9660

GAACGAAGAA CCGTCATCAA CACCATCTTG TCCAGGCTTC TCTAACGTGC CTATATCCGT 9720

CTTTTCTGGC GCATCTGTTG CATTAGTAAT ATGAATAATC CTAGATGTGT TAACTGCGCT 9780

TAAAACGCTA TCTATGGACT GCTCATACGA TTCAATTGCT TTACCGTAAT CATCTGTAAG 9840

TTTAGACTTT TGCCAATTCG TTGTTGAATT ACCTTTAACA AGGTCAGCGC CATTGATTTG 9900

TTGTTCAACT TCGTTAACAC GTTCAAAAAT CGCTTGCTCT TTTTCAACTA TTTTATCGAA 9960

TTCAGCTGTA ACAGCTTGTG TTGCACTAGT TTGCGTCGCA GTAATAGCTT GTATAGCTTC 10020

GTTTTGCTTG ATTTCGATTT GTTGAATGCC TTTTGTCGCA CTATCATTCA CTTTTGCTAT 10080

TAACGTTTGT GTATCAGCCA TATTTTGCTT TAATTGGTTA AAATCTTTAC CGACAGCTTC 10140

GATAGTATCT TGAATAGATT TGATATAAAC AAGCTTTGTT ATACCATCAA ACCCACTAAC 10200

TAAATCATTT TCAATATTGA AGCTAAATTG ACGTTCAACA ACAACATTAT TACTCCCGTT 10260

TTGTGTAAAG AATGCCTGAG CATGCACCTT GCCTGAATGT TTTAAAAATT CATTCGGTAT 10320

CACATACTGC AAACGCCCAT TAATTGCGTC TACTATCGTT AATTCGTCTG AAATATAAGC 10380

GCCTCTATCT ACGTTATAAT CATCGGTTTT TAAnACGATA GAT6TTTTAA CATGTTCAGA 10440

ACTTATAGAT AAGGGTCTGT TATnCTTAGT 10470



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3647 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
ATCAGATCTT GAGAATCGAG TTATTAAGTC TATCGAAGAC TTAACTAAAA TCCAACCATT 60 

CATGCCTACA CAAGATTTTG ATTTTAAAAC TAAAGAAATT CAATCAAACA TTTCTGAAGA 120

AA6ATTTATC GAAATGATTC AGTATTTCAA AGAGAAAATA ACAGAAGGGG ATATGTTCCA 180

AGTTGTGCCA TCAAGAATTT ACAAATATGC GcATCATGCT AGTCAGCATT TAAATCAACT 240

TTCGTTTCAA CTGTATCAAA ATTTAAAACG ACAAAACCCA AGTCCATATA TGTATTATCT 300

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

TCAAATTGTA ACAACTAATC CTATTGCAGG TACGATTCAA CGTGGTGAGA CGACACAAAT 420

AGATAATGAG AATATGAAAC AACTACTTAA TGATCCAAAA GAATGCAGCG AACATCGTAT 480

GCTAGTTGAT TTAGGACGTA ATGATATTCA TAGAGTAAGT AAAATCGGTA CCTCAAAAAT 540

TACTAAATTA ATGGTTATTG AAAAATATGA ACATGTTATG CATATCGTAA GTGAAGTCAC 600

AGGTAAAATA AATCAAAATT TATCGCCAAT GACAGTTATT GCGAATTTAT TACCAACAGG 660

TACCGTTTCA GGTGCACCAA AATTACX5TGC AATTGAAAGA ATATATGAAC AATATCCACA 720

TAAACGGGGC GTTTATAGTG GTGGTGTTGG ATACATAAAT TGTAATCATA ACTTAGATTT 780

TGCATTAGCA ATTCGAACGA TGATGATAGA TGAGCAGTAT ATCAACGTAG AAGCTGGTTG 840

TGGCGTTGTA TATGATTCTA TTCCTGAAAA AGAACTGAAT GAAACGAAAT TGAAAGCTAA 900

AAGCTTATTG GAGGTGAGCC CATGATCTTA GTTGTAGATA ATTATGATTC CTTTACATAT 960

AACCTAGTGG ATATTGTTGC TCAACATACT GACGTCATTG TTCAATACCC TGATGATGAT 1020

AATGTGCTGA ATCAATCGGT GGACGCTGTT ATTATATCTC CTGGTCCAGG GCATCCATTA 1080

GACGATCAAC AGTTAATGAA AATCATATCA ACCTATCAAC ACAAACCCAT TTTAGGTATT 1140

TGTTTAGGGG CTCAGGCACT GACTTGTTAC TACGGTGGAG AAGTCATTAA AGGCGACAAG 1200

GTTATGCACG GCAAAGTTGA TACACTAAAG GTTATATCGC ATCATCAACA TCTGTTATAT 1260

CAAGATATAC CAGAACAGTT TTCAATTATG AGATATCATT CATTAATAAG TAACCCTGAC 1320

AATTTTCCAG AAGAATTGAA AATTACTGGA CGTACCAAAG ATTGTATACA GTCATTCGAG 1380

CATAAAGAAA GACCGCATTA TGGTATTCAG TACCATCCTG AATCATTTGC TACAGACTAT 1440

G6TGTCAAAA TAATTACAAA TTTCATTAAT CTAGTGAAGG AAGGATGAAA ACCATGACAT 1500

TACTAACAAG AATAAAAACT GAAACTATAT TACTTQAAAG CGACATTAAA GAGCTAATCG 1560

ATATfiCTTAT TTCTCCTAGT ATTGGAACTG ATATTAAATA TGAATTACTT AGTTCCTATT 1620

CGGAGCGAGA AATCCAACAA CAAGAATTAA CATATATTGT ACGTAGCTTA ATTAATACAA 1680

TGTATCCACA TCAACCATGT TATGAAGGGG CTATGTGTGT GTGCGGCACA GGTGGTGACA 1740

AGTCAAATAG TTTCAACATT TCAACGACTG TTGCTTTTGT TGTAGCAAGT GCTGGcGTAA 1800

AAGTTATAAA ACATGGtAAT AAAAGTATTA CCTCaAATTC aGGTAGTACG GATTTGtTAA 1860

ATCAAATGAA CATACAAaCA ACAACTGTTG ATGATACACC TAACCAATTA AATGAnAAAG 1920

ACCTTGTATT CATTGGTGCA aCTGAATCAT ATCCAATCAT GAAGTATATG CAACCAGTTA 1980

GAAAAATGAT TGGAAAGCCT ACAATATTAA ACCTTGTGGG TCCATTAATT AATCCATATC 2040

ACTTAACGTA TCAAATGGTA GGC6TCTTTG ATCCTACAAA 6TTAAAGTTA GTTGCTAAAA 2100

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AAGCAACACT ATCTGGTGAT AATTTGATAT ATGAATTGAC TGAAGATGGA GAAATCAAAA 2220 

ATTACACATT AAATGCGACT GATTATGGTT TGAAACATGC GCCGAATAGT GATTTTAAAG 2280 

GCX;GTTCACC TGAAGAAAAT TTAGCAATCT CCCTTAATAT CTTGAATGGT AAAGATCAGT 2340

CAAGTCGACG TGATGTTGTC TTACTAAATG CGGGTTTAAG CCTTTATGTT GCAGAGAAAr 2400 

TGGATACCAT CGCAGAAGGC ATAGAACTTG CAACTACATT GATTGATAAT GGTGAAGCAT 2460 

TGGAAAAATA CCATCAAATG AGAGGTGAAT AATATGACGA TTTTATCAGA AATTGTTAAA 2520 

TATAAACAGT CACTTTTACA AAATGGCTAT TATCAAGACA AACTTAATAC CTTGAAAAGT 2580 

GTGAAGATTC AGAATAAAAA ATCTTTTATA AACGCAATTG AGAAAGAACC AAAGCTAGCA 2640 

ATTATTGCAG AAATTAAATC GAAGAGTCCT ACAGTTAATG ACTTACCTGA ACGAGATTTA 2700 

TCGCAACAAA TCTCAGATTA TGACCAATAT GGTGCAAATG CCGTGTCCAT TTTAACTGAT 2760 

GAAAAGTACT TTGGTGGTAG TTTTGAAAGA TTACAAGCAT TGACGACAAA AACAACATTA 2820 

CCCGTATTAT GCAAAGACTT TATTATAGAC CCGCTTCAAA TTGATGTTGC TAAACAAGCT 2880 

GGTGCATCTA TGATTTTATT GATCGTTAAC ATCTTATCTG ATAAACAATT GAAAGATTTA 2940 

TATAACTACG CTATATCGCA AAATCTAGAA GTGTTAGTTG AAGTACATGA TCGCCATGAA 3000

TTAGAACGTG CCTATAAGGT TAAT6CTAAA TTGATTGGTG TAAATAACAG GGACTTAAAA 3060 

CX5ATTTGTTA CAAATGTGGA ACATACAAAT ACTATTTTAG AAAATAAAAA AACAAATCAT 3120 

TATTATATTT CTGAAAGTGG TATTCACGAT GCATCTGATG TAAGAAAAAT CTTGCATAGT 3180

GGTATCGATG GCTTACTAAT AGGTGAGGCG CTTATGCX3TT GTGACAATCT ATCTGAATTT 3240 

TTACCACAAC TGAAAATGCA AAAGGTGAAG TCATGATGAA ATTGAAATTT TGTGGCTTTA 3300 

CATCAATAAA GGATGTTACA GCGGCCAGTC AATTACCTAT TGATGCGATA GGTTTCATCC 3360 

ATTATGAAAA AAGTAAAAGG CATCAAACAA TTACCCAAAT AAAAAAGTTA GCGTCTGCTG 3420 

TTCCAAATCA TATCGATAAA GTATGTGTCA TGGTAAATCC TGATTTAACA ACAATTGAAC 3480

ACGTATTAAG CAATACGTCA ATTAACACAA TACAGTTACA CgGCACAGAA TCTATTGATT 3540 

TTATACAGGA AATTAAAAAG AAATATTCAA GCATTAAAAT CACTAAAGCT TTAGCTGCaG 3600 

ATGgAAAACm TwATCCCAAA caTtAAtnAA tnTTAgGGGG TCCGTGG 3647 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5966 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

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

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

CcAcCTTGAC CACCTTTACG TGGAATCTTT TCmCCTkGAG CAACaTCGaT AATaTATATT 60

GAAAgTCAAC AAGTTCTGGA CTAAATGTTG CTGCTAAGTT ATCGCCACCA GATTCTATGA 120

AAATTAGTTC TATATC6TCA TGACGTTCTA ATAATTCGTC TATTGCTGCA AAGTTCATAG 180

ATGCATCTTC ACGAATCGCA GTATGAGGAC ATCCACCAGT TTCAACACCA ATGATACGAC 240

TTTCAGGTAG AACTCCTGAA TTTACTAATA TCTTTTCGTC TTCTTTTGTA TATATATCAT 300

TTGTAATAAC GCCGATACTC ATTTCTTTTG AAAGACGTTT TACAACTTTT TCAATTAATT 360

GTGTTTTACC TGCACCTACA GGACCACCAA TACCAATTTT AATCGGATTT GCCACAATTA 420

TAACCTCCTA TGATATGAAA tTCTAACATT GaCGTTCTCA TGCGCCATTT GATTTA6TTC 480

TAAACCAGGC GCTGTCATGC CAAAATCTGC TTCTTTTAAT TCGAAAATCT GCTTTCTTGT 540

TCCTTCTATA TAAGGAATCA TGTGAGTAAC TATCTTTTGA CCAGCAGTTT GTCCAAGTGG 600

AATAGCACGA ACAGCATTTT GAGTTAAACT TGAAACATTT TGATATAAAT AGTAATCAAT 660

AATCGTTTCA ATATCTACAC CTAAATGATG GCCTAGCATA GTAAAACAAA TAGCTGGATT 720

TnACXTTGCT TTCTTATCTT GCATTTGTTG ATGATACCAA GCAATCCATG GGCTATtATA 780

AAGTTCTAAA GCCAATTTAA CCATGCGAGT CCCCATTTGT kTTGCACCAA CACGTGTTTC 840

TTTAG6TAAG TTTTGrACAr ACATCAGTTT ATCTATGTGT AATACTTTTT GTGTATCATC 900

ATTTTCCAAT GCATCATAAA CTAaACGCAT GGCTAAACCA TCAGAATAGG TAAGTTGCTC 960

TTGTAAAAAC ATTTTTAACC AAGCAATAAA AGTATGATCG TCATGAATTA TATTTCGTTG 1020

AATATATGTT TCAAGACCAA ATGAATGACT GAAAGCACCT GTTGGAAACT GTGAATCACA 1080

GAACTGAAAT AATCTTAAGT GTGTATGATC AATCATGAGA ATGCCCTATA TGTCTGAAAG 1140

CCTEATTAAC TTTACGGTCT TCTCGAACAT ATGGGATGCC TAAACTTTTT AATAAATCTT 1200

CAACTAAATA ATCATATTGT ACTAGCATTT CAGTCTCTGT AAATTGTGCT GGCAAATGAC 1260

GATTTCCTAA TTGATGGGCT ATATCTCCCA TTTCTTGCAA TGTTCTTGGT TGAATCACTA 1320

AAAGATCTTC TGAATTAACA TCCACAATAA TCATATTATG GTCATCTGCG TATAAAATAT 1380

CTCCATATTG TAAGTCAATA GGTTGTTTTA AACGAATGCC TATTTCAGTG CCATGGTCTG 1440

TAACGACTCT TTGAATACGT TTAACAAGAT CTGAATTTTC AAGGTATACT TTTTCGACGT 1500

GCTTTTGTTT TTCTGAATTT GACAAATTGG CAATATTGCC TTGGATTTCT TCAACAATCA 1560

TTCTATGTTC CTCCTAGAAT AAGAAGTATC TTTGAGTTAA TGGTAACTCA GTTGCTGCAT 1620

TACTTGTAAT TTTTTCTCCA TCTACATATA CTTCATATGT TTGTGGATCA ACGTCTAATT 1680

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GACGCACCAT GCGTTTTAAA TTTAATGCAC GATTGATACC ATTTTCATAA GCAGTTTTAG 1800 

ACACGAATGT CATTGACGTA CTTGTAAGGT TTCCGCCGTA TTGACCATAC ATTTTACGGT 1860

ACTTCATCGG TTCAGATGTA GGTATAGAAC CATTTGCATC GCCATTTACG GCAGAGTTAA 1920 

TTAATCCGCC CTTTACAACT AATTCAGGTT TAACCCCAAA GAAAATTGGG TCCCATAAGA 1980

CAATGTCAGC TAGTTTGCCC GGCTCGATAG ATCCTACATA TTCAGAAATA CCATGTGTAA 2040 

TTGCTGGGTT AATTGTATAT TTAGCGATAT AACGTTTGAT GCGATTATTA TCATTATGTT 2100 

CAAAATCACC ATCTAAAGGA CCACGTTGTT CTTTCATGCG ATGTGCTACT TGCCATGTTC 2160 

GTGTAATTAC TTCACCTACA CGGCCCATTG CTTGTGAATC GGAACTAATC ATACTGAATA 2220

CACCCATATC TTGCAGAACA TCTTCTGCTG CAATCGTTTC TTTACGAATA CGTGAATCTG 2280 

CGAATGCGAT ATCTTCAGGA ATAGCCGCAT TTAAATGGTG AGTAATCATT ACCATATCTA 2340 

AATGTTCATC TACAGTATTA TGTGTATAAG GCAAAGTTGG ATTTGTAQAT GAAGGTAAAA 2400 

TATTTGAAAA TGCAGCGGAT TTAATTAAAT CAGGCGCATG ACCGCCACCA GCACCTTCAG 2460 

TATGGTACAT ATGAAGTACA CGGTCTTTAA CAGCAGCCAT TGTGTCTTCC ATAAATCCTG 2520 

CTTCATTTAA AGTATCTGCA TGTAATGCAA TTTGAACATC AAATTCATCA GCAACATCTA 2580 

ATGCATGACT CAAAGCAGAT GGTGTTGCAC CCCAGTCTTC ATGTACTTTT AATCCAATTG 2640 

CTCCGGCATT GATTTGTTCA ATGAGTGCAG TTGGATTTGT TGCTTGTCCT TTACCTGTAA 2700

AACC6ACATT AATCGGTAAA CCTTCGGCAG CTTCTAACAT TCTATGAATA TGCCATGGAC 2760 

CTGGAGTTAC AGTTGTTGCT TTAGAACCTT CTGAAGCACC AGTACCACCA CCAATATGAG 2820 

TCGTAATACC ACTTTCTAAT GCGACCTCTG CTTGTTCAGG ATTAATAAAA TGAACATGAG 2880

TATCAATACC ACCAGCAGTG ACGATTTTAC CTTCAGCGGC AATGATATCT GTTGTTGAAC 2940 

CTATAATAAT GTCGACATTA TCCATTATAT CTGGGTTGCC GGCATTACCT ATGGCGAAAA 3000 

TATAACCATT TTTAATGCCT ATATCAGCTT TAACCACTTT ATCGTAATCG ATAATAACGG 3060 

CATTAGAAAT GACAAGGTCT GCAACGTTCA CGTCATCACG TGTTACACGA GGATTTTGCG 3120 

CCATACCGTC TCTAATAGAT TTACCACCAC CAAAAGTAGC TTCTTCACCA TAAACCGCAT 3180 

AGTCTTTTTC TATTTGAGCA AATAGATTCG TATCACCTAA ACXSAATGGAA TCTCCAACAG 3240 

TTGGACCGTA TAAGCTCGTA TATTGATTTT GCGTCATTTT AAAGCTCATG ATCTTTTTCC 3300 

TCCTTTTTTA TTCACGTTTT CAGCACCGTT ATCTCCGAAT ACACCTGCAT ATTCATCATT 3360

TTCATCAGTT GGGCGATAGA CACGTGACTC ATCGATAGGA CCATTGACCA TACCACGAAA 3420 

ACCAAAAATT TTACGTTTGC CAGCATATTC AACTAATTGA ACTTCTTTTT TATCCCCAGG 3480 

TTOGAAATCT AATGCTGCAT TTGCTTCATA AAAATGAAAA TGTGAGCCCA CTTGAATTGG 3600

TCGATCTCCT GTATTTTCAA CTTCGATAAC TGTTTCAGGA TGATGGTTAT TAATTTCAAC 3660 

CTCTGTACTT TTTGTAATAA TTTCTCCTGG TATCATTTGA CTGCCTCCTT TAAACAATAG 3720 

GGTGATGTAC TGTGATTAAC TTAGTACCAT CGGGGAACGT AGCCTCGATT TCGATATCTG 3780 

TAATCATGTG TTCGACACCA TCCATGACAT CTTCTTTGTT TAGAATTTGT CTACCATAAC 3840 

TCATTAACTC TGCAACGGTC TTACCATCGC GTGCACCTTC TAATAATTCA TCGCTGATTA 3900 

AAGCTAATGC CTCAGGATGA TTTAGTTTCA AACCACGTGC TTTACGACGA CGTGCAACTT 3960 

CCGCCGCCAC TACAATCATT AATTTGTCTT GCTCTCGTTG TGTAAAATGC AAATTAAAAC 4020

CCCCAATTTC ATATTAGATA CaATTTACAA AATTTATATT AATCCTAATT GTTGTGATAA 4080 

ACAAGTAATA TACAAAGTTC AATGTGTAAT TAGAAAATTA TATTTTTAGC ATATCCGATA 4140 

TTGAAGCAAA CAATCTAATC GAAAACAAAT AGTGGAATAT ATTTATGTAA AAACCAAAAT 4200 

AGTTTTTAAT ATAACTTTTC ATAGAATAGT AGTATATTAA TGAGTAATGA TTCAAAGGAA 4260 

AGGTGAT^GA TTTGAAGATA ATAGATGTGC TTTTGAAAAA TATATCTCAG GTTGTGTTAA 4320 

TTAGTAATAA ATGGACAGGA TTATTTATCT TAATAGGATT ATTTGTAGCC GATTGGACAA 4380

TTGGATTAGC GGCTATTGTA GGTAGCATCA TCGCCTATAC TTTTGCGCGT TTTATAAATT 4440 

ATAGTGAGGC AGAGATTAAT GATGGGTTAG CTGGATTTAA TCCAGTGCTA ACTGCCATTG 4500

CGTTAACAAT CTTTTTAGAT AAGTCAGGAT TAGATATTGT TATAACAATG ATAGCAACTT 4560 

TATTAAGGTT ACCAGTTGCT GCTGCAGTGA GAGAAGTTTT AAGACCATAT AAAGTTCCGA 4620 

TGCTGACGAT GCCTTTTGTC ATTGTGACTT GGTTTACAAT TTTACTTTCA GGACAGGTTA 4680

AATTTGTAGA TACATCGTTA AAGTTAATGC CTCAAAACAT TGAAACGGTT AATTTTAGCA 4740 

ACAATGATAG AATaCATTTC ATTCAGTCAT TATTTGAAGG ATTCAGTCAA GTATTTATCG 4800 

AAGCGAGTGT AATTGGTGGC GTATGTATTT TAATCGGCAT ATTGATAGCA TCAAGAAAAG 4860 

CAACACTCTT AGCTGTTATA GCTAGTTTGT TAAGCTTTAT CATTGTAGCT CTATTAGGTG 4920 

GTAATTATGA TGATATTAAT CAGGGATTAT TCGGTTATAA CTTTGTATTA ATGGCAATCG 4980 

CACTAGGATA TACATTTAAA ACAGCGATTA ACCCTTATAT TTCQACTTTT TTAGGTGTGT 5040 

TATTAACAGT AGTGGTGCAA CTAGGTACT^ CAACATTGCT TGAACCXJTTT GGCTTACCTG 5100 

CATTAACATT GCCATTTATT ATCGTGACAT GGATTTTATT ATTTGCTGGT ATTAAACATG 5160

ACAAAGTAGA TGCTTGATAG TTAAATCAAA CCTAATATTG TTTGAATATC ACCTTAAACT 5220 

ATACAGCGAA TTGTATAGTT TAAGGTGTAT TTTTATGGAT AAAATTAAGT GCATACTTAA 5280 

GTGTTAAACT AGGAATAAAT AATTTATATT GTGTGTTGTG TGGGGTGACT AA7ATGAATG 5400

ATATGGATAA TTCCTTTTTA ATAACAACGG AAATTCAAAG AAAATGGATT GAAAAATTCA 5460 

AAGTAATTAG AGATACATTT AAGGCTAAAG CTGAATATAA TGATCAACAT AGCCAATTTC 5520 

CATATAAAAA TATTGAATGG TTAATTAAAG AAGGTTATGG AAAATTAACX? TTACCAAAAG 5580 

CATATGGTGG TGAAGGTGCG ACCATAGAAG ACATGGTTAT TTTGCAATCA TTTTTAGGCG 5640 

AACTTGATGG TGCCACAGCA TTATCTATTG GTTGGCATGT GAGTGTCGTA GGACAAATTT 5700 

ATGAACAGAA ATTATGGTCT CAAGATATGT TGGAGCAATT TGCTGTTGAA ATTAATAATG 5760 

GTGCATTAGT TAATAGAGCA GTTAGTGAAG CTGAAATGGG TAGTCCAACA AGAGGGGGAA 5820 

GACCAAGTAC ACATGCTGTT AAAGCTGATG ATGGGTATAT TTTAAATGGT GTGAAGACAT 5880 

ATACATCAAT GAGTAAAGCA CTAACACATA TTATTGTTGC TGCTTATATA GAAGAATTAG 5940 

AAAGTGTTGG TTTTTTCTTA GTAGAC 5966

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17310 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

CTGTGTCATC GCGAAATAGT TAGGGTCATT CATTAATCCT TTTGAACGTA TTTCATCAAA 60 

ATATAACAAT TTCATTAGTA AAGGGGACTT GTTCAAACCA GCTATAATAC AAAATAGACC 120

TATAGTCACA CTGCTTATAA TATAAGAGGT AACGATCACT TTTTTGCTAT TACCTAACTT 180 

AAAGSTGATC ATCCCTAAAT AGAAATAAAT GACTACAAAT GCATATTTAA CTGTAGATGC 240 

AAGAACTTCC TTAACCGTAA TAAATATCAA ATCATCAAAA AATaGCaAAC AArGCGTAAT 300 

AATCATACGA TATGTATACA AAATAATGAm AAACTGTmAA AAATGATTTG CCTTTAATAA 360 

ATGGTTAGCG AAAAACAGTA AATAAACTAA TATTAGTAAT GTGATAAAGT CAGCTATAGA 420 

AACATTCACA CCGGCAATAA CCGAAGATTG CTGAATAAAA ACCGCTAAAC CGATAAGTAA 480 

CAATGTTAGT AATTTACTAT TGTGTTGATT TTCCATTATA AACGTCTTCC ACTTCTTTAA 540 

TCATTTTCTC CTCAGTAAAA CATTCTAAAT AACGTTTTCT AGATTGATTA CTCATTTTGA 600 

TGTAATCACT GTCTATTAAA TATTTTTCCA GGACTTTAGC AATAGTTTCG GGTTGGTTGT 660 

TCATCATACA TATACCATTA TCAGCTACTA ATTCTGAAAT ACCGCCAACA TGACTGGCTA 720 

TTATTAAAAT AAACGTATCX5 TATTGTGATA ATAAATGACT CGCATTAATG ACATTGCCCA 840

AAAATGTGAC ATCATTTTCT AACCCAGCTT GTACAACTTO TTGCTGACAA TCATTTAATG 900 

TAGGTCCATC GCCTATAAAT GTAAAATGCG CATGATTACT GTTATGTAAT TTCAATATCT 960 

CTATTGCCGC GATTAGATTT TGTGGCAATT TTGGATAAGC AAATCTTGCA ATCATAACAA 1020 

ATTGATGCTT TGTCGGGGCA TTAATCTGTA AATCTTGTTT ATTAGGCAAC ATTCCAACTA 1080 

CTTCGCCAAT ATTGTTATGT GATTGGCTTT TTAGCGTTTG CTTAACAGCG GGAACATCTG 1140 

CAATACCATT ATGTATTGTG GTTAATTTCA ATCGATTAAA TCGATATTTT AACGCTAACT 1200 

GTTTATCGAA ATCTGAAACA CAAATAATGC TATCTGTAAT AAGTGACATT AATTTTTCGA 1260

TAACTAAATA TAGAAATTTT TTAGCTGGTT TAACACCCTC TGTAAAAGCC CATCCATGTG 1320 

CAGTAAAAAC TATACGTGTG TCTTTCGATT TCGAAATGAa CTtCGCAATT CGTCcGACCG 1380 

TtCCAGCTTT GGAAGAATGT AAATGGATAA CATCAGGTTT AATTTTCGAG AATAACTGTG 1440 

CTAACACTTT GACAGCTAAA ATATCTTGTT TAAAGTCAAT TGGACCTACT AAATGTTCGA 1500 

TAATAATTAC ATTAACTCTT GCATCTAGTT GTTCAATCAT TGGTCCATGA TTGCCTACAA 1560 

TGACATAAAC ATCATTGTGT ACGCAAAAAT GGTTGGCGAG TTGAATGAGA TGTGTTTGTG 1620 

CACCACCATT GTCTGCTTTA GTAATACAAT ATATAATTTT CAACTGTTAC AAACCCCTTT 1680 

AATGCTATAC TTTCAATTTC TTAACATGGC TATCTCATCA GATGAATAGT ATTTATAGCC 1740

ATGCAAATCA ATGATGGCAC ATATTTCTTA ATGCCATTTG ATACTGTCTC AAGGGATTCC 1800 

TCGTTATACT GTAACAATTG GTCACAATCT TTAAAATATA ACTTTTATTT GAACTTATTA 1860 

AGTAAATTAA GACTACCTTG AGCCTTCCCC TGTAATAACA ACCATCAATG TTCTAATTGA 1920

TATATATAGT TCCATCATTA AACTACCTTT ATGTATATAT TTCATGTCAT ATTTCAGTTT 1980 

TTGTTGCGGT GTTAAGTCAT ATCCACCTTG AATTTGCGCA AGTCCTGTTA ACCCTGGTGT 2040 

AACAAGACAT CTTTGCTCGA AACCTATCAC TTCTGAACTA AATAATTCTA CAAATTCCGG 2100 

ACX3TTCCGGG CGTGGTCCAA TAAAACTCAT TTCCCCTTTA ACAACATTAA TTAGTTGTGG 2160 

TAATTCATCA ATGCGTGTTT TACGAATAAA CTTCCCGACA TTTGTTATAC GATCATCATC 2220 

TTTATCAGCC CATTGCGCAC CGTTTTTCTC TGCGTTTTTG CACATCGAAC GTAATTTGTA 2280 

TATTTTAATT AATTTACCCA TCTTCCCAAC TCTAACCTGA CTATAAATAG GGTTTCCTGG 2340 

CGAATCTATG ACGATAGCAA TGGCGAATAT AACCATAATC GGTAAAGTTA AAAATAATAA 2400

AACAATGCTT AAAATTAAGT CAATCGCACG TTTAATTGGG TAATAGCTTT TTCTCACTTC 2460 

TTCTAGTTTG TCTAATTTTC TTTGATAGGC ATAACCCTTA TTATTATGGA CAGCTTCAAT 2520

AATTAAAGTA ATCCTTTAAA CCTOTTTCTA CTGTATATTT AGGAACAAAT CCTAATGCCT 2640

TTAAGTTAGA AATATCTGCA TAAGAATGCT TAATATCTCC TTTTCGTGCT TCTTTAAATT 2700 

CATGCTCX5AC TGATTTTCCA TATAATTCAC CAATAATACG ATAAACCTCT AATAAATTAG 2760 

TAAAAGTGCC TGTACCAATG TTATAACCGT GTCCAATTGC ATCTTTGTGT TCCATAATTA 2820 

AGCGTACAGA TTGAACAACA TCATATACAT ATACAAAATC TCTAGTTTGC AGTCCGTCAC 2880 

CAAAAftATGT AAATGGCTTG TTATGCTCAA ATGAATCGAA CATCTTTGAA ATCACACCTG 2940 

AATATTGTGA CTTAGGATCC TGTCTTGGCC CAAATACATT AAAAAATTTA ACAACCGCTG 3000 

TTGGTATGTT ATATAACGAA CAATAATTTA ATGTCGTCCG TTCGCCXJTAA TATTTATCTA 3060

TTGCATATGG TGATAATGGT AAGATTAATG ATTGATCACT TTTAGGCAAA TCAGGAAGAT 3120 

CACCATAAAC AGCTGCTGAC GAAGCAAAGA TAAAACGTTT TATATGATTA TTATATTTTT 3180 

TAATGATTTC TAACAATCTT AATGTTGCTA CGACGTTTAT TTCTTGAGAT AAGATAGGTT 3240 

TCTCAACCGA CTCAGCAACA CTAACTAATG CTGCTAAATG AATAACATAA TCAAATTGAT 3300 

ATGTCTTCAT GATTTGTTCA ACTGCATCAT ATTCACGAAT ATCTAATTCA AACACATGAT 3360 

CGTCAGCCAA ACTTTTAATA TTTTCTCGTT TACCTGTTCT ATAGTTATCT AGAACATAAA 3420 

CATCATAATC TTGTTGTAAA TCATCTACTA AATGCGACCC AATAAAACCA GCCCCACCAG 3480 

TTATCAAAAC TCTTTCCAAA TCTTCCACCT CATTTATACA TTAAAAATAT ATCATAAAAA 3540 

CATAAAGTAT TGTAAGCTTT TTATCGATAT TTTTTATTTA TAAAAATAAA ATGAGATAAC 3600 

TTTGTGAATT TTTATTQAGA TAAATTAGAT AGTGGTGTTT TTGTGATGTT TTATAATATC 3660 

TTGGGTGTGT TAATACTAAT AATGCTTTCA ACTGATGCAT TAGACTGTGA CATCATAACT 3720

CACTTAAGAA CTTCGCTTAT TAATTTTCTA CCAATACACT CCCTTCTAAG TGCACTAAAA 3780 

AATCCTTACT GCTAAGTGAT TAAACTTAAC AATAAGGATT TATTTATCAT TAGTGGATGA 3840 

TTATTAACGG AATCTCATAC CACCATCTAC AATAATTGTT TGTCCAGTAA TGTAATCAGA 3900 

GTCTTTACCA GCTAAGAAGC TCACTACATT TGAAACATCT TCTGGTTGAG AAACTCTGCC 3960 

CAAAGCAATC TGACTTGTAA ATTGTTCCCA ACCCCATGCT TCAGGTTTAC CTGCTTCTTC 4020 

GGCTGTTGCC ACTGCGATAC TTTCCATCAT TGGTGTTTGA AC6ATACCAG GTGCGAATGC 4080 

ATTCACAGTA ATACCTTCAG ACGCTAAATC TTGTGCGGCT ACTTGTGTTA AACCTCGCAC 4140 

TGCGAATTTT GTACTGCAAT ATAAA6ACAA GCCTGGGTTA CCCTCAACGC CTGCTTGAGA 4200

TGTTGCATTG ATAATTTTAC CGCCATGATT GAATTTTTTA AATTGTTCAT GTGCGGCTTG 4260 

AATACCCCAT AGCACACCTG CAACX3TTCAC GCCATATACT GTTTTAAACT GTTCTTCAGT 4320

GCCAAATTGC GCGGCAGTTT GTCTTAcTGC GTTAAATACA TCATCACGGT TTGATACATC 4440

TGCTTTGATA GCAATAGCTT TTGTACCATC ACTTGATAAT TTAAGTGCAG CTGCTTTTGC 4500 

CCCTTCTTCA TTGAAATCAA CAACTGCTAC TTTGAAACCA TCTTCCACTA AACGTTCTGC 4560 

AATTTTAAAA CCAATCCCTT GTGcTCCGCC AGTTACTAAT GCTACTTTGT TGTTTGTCAT 4620 

AAAGATCACT CCTCAAATTT CTTTCCTTTA ATTACATTTT ACTCCTCTTC ATTTGAATAG 4680 

TACAACAAAG GTAGCTCCAT TTAACAAAAT ATTCAGATAT TTAAGGTATA GTTAAACGCA 4740 

CTACCATTAG TGATTGGCAA TGCGTTTAAA TGTCGTTTTA AAAGTTCTTA TGTTGAATAT 4800

TATTTTTTTA AGTCTCTCGA TTAGTTTGTC ATCAATCTTT TTTCGAGACA TGGTCTTTTG 4860

ATTCAATAGG CGGTTCCGTG TTATCACTGA CAACTTTAGT TGTAGCTTCA TCTTTATGTA 4920 

TTTCTTCGTT AAATCCTTCA AGGTTTTTAG TCGTGGGATT TTTAACCTCA GGATGTTCCA 4980 

TCATGTCTTG ACTATCAAGT TCCTTTTTAC ACGTGTCTTT ATGTGATGCT TGATTTGCGT 5040 

TCCCTTTACT TTTTTGAATA GTGGTAGTAT CTGCTGCAGC TACTAATTTT TTTCTACCTA 5100 

AAATAGATAT GGCTGAAACA AACCAGAGTA TTGCAGATAC AAAGTTGCAT AATACTAAAG 5160 

CGATAATAGC CAATACAATT AATATGACAC CTTTTGAAAT CCTTTCTTTA AATAAGTCAG 5220 

ATGCCAATAC GATGACAGGT ACGATTGAAA GTATAATTAC AAATATAGAA ATTATTGCCG 5280 

ATATAACTAT TGTTACTATT AAATAATCAG CPCTGCTACC TQATAATAAA TAGAAAAGGC 5340 

CGAAAATTAG TCCATAGCAA ATTACAAACC CACATAAAGT TATAGCCATG AGTACTATAT 5400 

AAGCTATTTG AAAATATAAA CCTATCTTTA TGAATGATTT TTCTACATTT TTTTCCATGT 5460 

CTATTCCCCA TTTATTTAAA ATTTATACTT TACCTTAAAT ATTCTCTTTA TTCTTTAGTG 5520

ATTTTATCTT TAGATTCAAA TTGATTCTCT GTACTTTCAA TATCAACTTT TTCATTTTCG 5580 

TCTGTCGATT CATCTTTTGA GTATTTATTC CAAATCAGCA AAATACCACC AATCAGCCAT 5640 

AAAATTGACG AAAGGAAATT ATATAAACAC AGTGCAATAA TAGCATAAAC AATAAAAAGT 5700 

GCACCTCCGA TTACAGAGTA ACTTTCCATA TAAATCGCAG TAAAGATGGT TGGTAAAACA 5760 

GTGAAAAGAG CCAATATTAA TCCTAATAAA AAAATTGTTT CGTAATCAGA TCCTCCAGCA 5820 

ATATTAATAG ATATCATCCT AACAAAAACG ACACTAAAAT ATATTTGAGC TACGATGCCT 5880 

ATCCAAATTG CTATTTTTCC TATAATTGAG CTCATACTCA TTCCCCATTT ATTTAAAATT 5940 

TATACTTTAC CTTAATATAC CTTATTTTAT TTAATTTTTA TATGCAAAAT ACAAAAATGG 6000

AGAACTTCAA TATTTATAAA ATATCAAAAG TTCTCCACAC TATATTGTTT TATTATATTT 6060 

TCGCTATCAA TACGCTAAAT CATCATATTT CCCTCAACAT CACAGTAAAA CTATTGCTCC 6120

TTCCAATTGC GCAGTTGTTC AACATCATCA TCTTGTTTAA GTAATGCCAG TGGTACTTGA 6240

AGATTAAGAC ATCGTCCTGA AATATTAAAG CGTGTCACAC CTGCTGGCAC AGTTTCCCCT 6300 

TTATGAACAA CCGCTTCAAT TTCCTTATAA CTCAATGGCT GATACTTCAT GAGTACATCT 6360 

TGTTGAGAAA GACAAGGATA TGTACCTTGT GCAATTCTCT CTACAGAACA ACAACCACTA 6420 

TAACTTGCGA CAACCTTTTC CCATACTTGA AAATGTGCTT CGCCTAAATC TTTTGTATAC 6480 

AAATATTGTT CTGTATCACC ATGACACATT GTAATAAATG GCGCTTCTTG TCTTGTCTCA 6540 

GTAGTCCATG GCAAGCGATG TTCTTGTTGT AACGTTTCCC ACCACACACC AAATGGAACT 6600 

ITATGTTGCC ATGTACTAAT TGAATATTGT GTTTCATGGA TTTCTTGCAC TGGAACTTTC 6660

TTACATCCTA ACGCTTTCAA ACTTGTATAC CGATGCACAC CATCTATAAC CATATATCTA 6720 

CCATGTTGCA TCGCTGTCAC TAAAATAGGA TGACGTATAA AATCATCTGC TTCAATACTA 6780 

CTTTTCGTTT TTTCCAATCT TAAAGGTTCG AATGTTTCGT GAAGATCAAT CTTATCTACT 6840

GGTACCAATT TTAAATGTTC ATGAATATGA TTCAATAGTT ATTCATCCTC CTTTGTTTGT 6900 

GTTAAATAAA TAAATTCAGG ATGTGGATGG CTTAAGAAAT CGTGATGTGA AATAGACCAT 6960 

CCGTATGCAC CTGCATATTT GAAAACAATA ACGTCGCCTG TACTGATTGC GTCTATCTGT 7020 

ACrrCTCTAG CAAAGACATC TTTCGGTGTA CATAATTGAC CGACTAACGT TGTGTCCTGT 7080 

CTCGAAATTG AAACTTTTTC AAATGAATAT GGATTGTCCT TATAGCGATA AATGTCAAAA 7140 

GGATGGTTAT GTTGCCAAGA TACCGGCAGT CTAAATTGTT GCGTACCTCC TCTTAATATG 7200 

6CATACCAAG CACCATGTAC TTTCTTAATG TCTAGCACTT CTGTCACATA GTAACCAATA 7260 

TGTGCCACAA TAAAGCGCCC ACATTCAAAG TTCAATGTCA CATCTTCCAT TTCTTGCTCA 7320

ACGATAAGTG TTTTAAAACG TTCTACAAAA TTATCCCATT CAAATTGGTT AGTTAAATCT 7380 

GCAXAGTTAA CGCCTATGCC ACCACCAAGA TTGATATGTT TGAGTGGAAA TCGATGTTTT 7440 

TCAGACCATG CCTTTGCTTT TTTAAAATAA AGTTTCACTA CATCGACATG TAAATTCGAG 7500 

TCTAAATTGT TAGAAATAGA ATGAAAATGA AATCCATCTA GATGAATCTT TGGCATTGCG 7560 

AGCGCAgcTT cAATGACATC ATCAACTTCG TCTTCAGAAA TACCAAATTG TGTTGGGCGT 7620 

CCTGCCATAT GCAACGTTGC ATTGGGAAAT GGTCCTGCTA AATTAACACG CAATAAAATG 7680 

TGTTGTGTCT TATCTTCATC TTCTAAGATG GCATTTAGCC GTTGTAATTC ATGCATACTT 7740 

TCAACATGAA TACGCTGAAC ACCTTCACTT ACTGCATATC TTAGTTCCTC GTCTGTCTTA 7800 

CCAGGGCCAC CAAAAATAAT ATGATTTGCT GGTTTAAAAG CAAGACCTTT TGCTATTTCA 7860 

CCTTGAGATG CAACTTCGAA TCCTTCAACA TACTGACTAA TTGTATCTA6 GATTTTTCGT 7920

TGTTGCAAAT GATGTTCCAG TCCX3ACTAAA TCATAGATAT AATGACAAAC TGGATGAGAT 8040

TGTGCTTTTA ATTGTTCAAT AACAGGTTGA ACTATACGCA TTAGCCTTCA TCCCCTTTCT 8100

GTTTAGACGT CGCTAGAGAT GCACTTAAAT GGCGATATAT TTTTCCGCGA TCATCACCTA 8160

AAATAAATGT TTGTACACCT TGTGCCTGCC ATTTTGCAAT ATCTTCATCT TCACGTGGTA 8220

ATGCACAAAA ATGTTTACCA TGTGCATTCA CAACTTCAAA AATATGTTGA ACATGTGATG 8280

TTACTTGATC ATCACGCGTT TGCCATGGTA TGCCAAGTGA CTGCGATAAA TCTGCGGCAC 8340

CTTCGACTAT CATGTCTAAA CCTTCGACTT GTGCTATATC GTCAATGGCC ATAACCCCTT 8400

CAACATCTTC TATCATGGCA ATCACCATAA TATGCTCATT AGCCATCTCC ATTGCATCAA 8460

GTAATGGTGT ACGTCCAAAT CTTGCCATGC GACCACCATT CAAACTTCTT AATCCTTGCG 8520

GGTAATAACG ACTTAATTTC ACAATATGCT CAACTGTCrC ACGATCTTTA ACGTGTGGCA 8580

CAATAATACC TCTCGCACCC ATATCCAACA CTTTAATGAT ATCTCTATCT ATCACTGCAG 8640

TGACACGTAC AATTGGTATA ATATGCGCTG CTTCAGCTGC ACGAATTAAA TGCGCTAGTG 8700

TCTCATCATT AATCGCCACG TGTTCTGTAT CAATCACAAC AAAGTCATAC CCGCTTGCTG 8760

CGATAACCTC GATCATCAAT GGGTCCGGTA TAGAATTAAA AATGCCATAA ACTGAATCAC 8820

CATTGTTTAA TCTATGTTTC AGAGATAGTT GTTGCATCAT TGATACCTCC TACACCTAAT 8880

GGATTTGTAA CATGATGAAT TCTTAACTCG GAGTCACTTA ATAATCGACG TGTCGTTAAC 8940

TTTTCAACTT GAATCGTAGG TTCAAACAAA TCGAAATGTT GATAGTTATT CAACTCTGGA 9000

AATGCTTCTT GATACGCCTC GATGATGCCT TTAACCCATT GCCATTGCAG CTCCTCATCG 9060

ATACCATATT GCTTTTCAAT AAATAAGATG ATTTCGGCGA TATTAATAAA GAAAAATGCA 9120

TCATGTAAAA AGTCGCGTAC TAAACGTTCG TCATCTGTTT CAATAAATGA ATTACTATTC 9180

ACTTFTTTAT GTGCTTCTGG CATTGGCTTT AATGTCAGGT GTGAAGCAGC TTCACTTAAA 9240

TGctCACGCT TAAAACGAAC ACCATCATGG AAATdTTTA AGGCAATACG TGTAGGCCAA 9300

CCATTTTCAT GAATGAGCAT CATATTTTGT GCATGCGATT CAAAGGCAAT ACCGTGATAA 9360

TAAAGCATAT GAATCATTGG ACGAATCGCT ACAGCTAAAA ATTGCTTTGT CCAAGCTTCA 9420

GMJCCkTKTT GTTTAATCCA ATTTTCAATG AATGGTACAC CATCCTTATC ACTTGCATAA 9480

AGTGCATTAA ATGGTATCGC ATCCTCTTCA TCGATTAACA TATGATATAT ATTTTCACGC 9540

CATATAACAC CTAACGCACC ATAAACTTGA GTTTGTTTAT AAGGCGAAAG TTGTGTATTT 9600

AAATAAGACT GTCCTAAGAC TTCCCCTAGA AAAACTGTCT TTAATTCATC TTTTAAATAC 9660

ATATCTTGTT GCTGTATCTG CTTTAACCAA TCCGTAATTT GCGCTGCATT TTCAATTGTA 9720



TATTTTGTCG 
TCACTTTCCC 
ATGACATGTT 
CCAGATGCTT 
AACATTTCGT 
GCTAACCACT 
AACGTAAATC 
AATTCATAGT 
CTTTGCGTAT 
TCATTTTTAG 
TCTCCCTCAT 
TGTGTCTTTT 
ACACCGTCTT 
AGTTGGTGCA 
CTCCTTGTTA 
ATTCAATTTA 
TACACAGTCA 
CGCTACAAGT 
ATGTCCTAAG 
ATATACAACA 
CGTTgTAGAT 
TTTTAATTCA 
CAGCTTTAAT 
TAAACCACTT 
AAATAATGAC 
AACAATCATG 
ATTCGGTAAC 
CTCATCTTTG 
GTCATTCGTA 



TGTCTATTGG 
CTAACCATAG 
CAAACTGCCA 
CAATTTGCTG 
TAACTACAAC 
GCAGTTTAAC 
CTAAACGTGA 
CGTTAAATGT 
CTTTTAATTC 
GAAATGTAAA 
CTCCTACGAC 
CAGCAGTAAA 
GATATGACGC 
TCACTCTAGT 
TGACAAATTG 
CTCATCAAAT 
ACAAATACTG 
TGCCATAACA 
TGATTTACTA 
GGGCTTGATG 
AGACAAATGC 
ATTAATGTAT 
ATCGGCAATA 
TGCTCAATCA 
GCCAATACAT 
GCACTATTTG 
AATGCACGAT 
ACTGATGCGA 
CGTATAAAAT 



CX3ACATCGTA 
TACTGTGCCA 
TGGGTGTACA 
TACAAAATGT 
ATTTCTTGAT 
GTTTGGTACA 
TTTGTAACTT 
CTCAGGTGTT 
TGTCTGTAAT 
TACAACCTCT 
ACGCTCAATT 
ACGATACTCT 
TTTATACACA 
CTTTACACGA 
GATTTGGTAT 
TCGCTTTAGC 
CGTTATTCGC 
CAACTTCATT 
CAACGTAATA 
CTGCCACAAC 
CTTCAAGATC 
TTTGTACATG 
ATGTACGATT 
CTTGTGATAA 
GAATATCTTT 
TTAATAAATC 
ATCCTTCTTC 
TAACTTGCGC 
TAGTGATTTT 



COAATCGATT 
TTAAGCCTTT 
GGTATCATCT 
TCATAAGTCT 
ACCGTCGTTT 
AAATGAGGAC 
GGATGATACT 
GCTGGTGGGT 
AACTCGACZAA 
CTCAATAATT 
GGTGATGTGA 
GAATCATGTC 
ACAATATTCT 
TTAAGAATTG 
ATGTGTATAA 
CGCAATGGTC 
GTATTCTTTT 
TCTAGTCGCT 
TTTAAGACGA 
ATTTGGCACA 
TCTGACAAAG 
TGCTTCTAGA 
CAAATAACAT 
CTTAGACATC 
ATCAGCATGG 
CATTTCAGGT 
AAACATCAAT 
GGCATCAATT 
AACGTGTATC 



GTTGAGGGTG 
CTTCAGCCAA 
CAACATCATT 
TATCGCCAAC 
CTACTTTATC 
CAAATTTCAA 
GATGCCCTTC 
TTGATTCTCG 
TAAATTGTTC 
GTGTATAGTC 
TACGTATACG 
CTTCTATTGT 
CATAAATAAG 
TTTGATTCAC 
ATAGGGTTTG 
GGCGTTTGAT 
TTCCAAGTCA 
TTACCAATAG 
TGCCATGCTT 
AGCTGTTTTT 
CATACGTCGG 
CTAATGCCTG 
TCAAGCCATG 
GGTGAATCAG 
TAATTCGGTA 
TCAACTGTTT 
TTAAAATGGG 
GTCCGTTCAA 
GGTAATTTTA 



ATATAGCTCA 
ATCAACTTGG 
TACATGTTTG 
TTGTTGACGT 
TTTGTCGATA 
ATTATCACTC 
CATCGCATAA 
ATACTGCATA 
TAGCTTTTCA 



ATGAAAGCTA 
AAAATGACCXS 
TGATGATACC 
AATACX3ATAC 
CACCACAATC 
ATAAATCTTC 
TAAGACGATG 
TTGATACTAA 
CATCATGTGC 
CAGTAGCAAT 
GTATGCCATC 
TGTTACTAAA 
CrrCTGGTGC 
GCATCGTTTC 
TCCCTTCACG 
GCCCTAATGG 
GTGTTTCAAC 
TCTGTTCAAG 
AATAAATGTT 



9840 
9900 
9960 
10020 
10080 
10140 
10200 
10260 
10320 
10380 
10440 
10500 
10560 
10620 
10680 
10740 
10800 
10860 
10920 
10980 
11040 
11100 
11160 
11220 
11280 
11340 
11400 
11460 
11520

GCCAAGGTCT TTTATTAAAC CTTGTTCACT ATATTGCATA TACTGTGGAT GCTGTCGCAA 11640

CACATTGATT TGATAAGGAT GTGTTGGTAA TAAAATAAAA TCTTTGGGTA TCTCTGATAT 11700 

ATCTATGTCT GCTAATTGAT ACAACACTTT CTCAACCTGA TCTTCTTTAC CTTCTACATA 11760 

GCGCGTGAGC AGAACATCTT GATGCACAGC TAAATAATGC AATTGGAATG ATGTATGACA 11820 

TTCGGGTGCA TATTTCTCTA AATCTGCTTC TGAAAACCCA CTTGCACTCT TAGGAGTCGG 11880 

ATGAAATGGA TGACCTAAGT ATAAAGATTG TTCTGAAACG ATATAACGAT CCTCTACGTA 11940 

GTCTATTGTG TTACTTTGCA AATAACGTGC CGTGCGATGA ATGCTATTAT CGATGTCAGA 12000 

CATAATTTGC GCCATATGTT GTTGCACTGC CX3TTTGATTA TCTGCACTTT GAGCCATATG 12060

TTGCAAAATA CGCGCAATTG CTTCTTTATA AGTTGTTATT TTTTTACTTT TTCCATCGAT 12120 

AAGCCATACC TCTGGATGAT ACATATGATG CCCCATCXSCA GACCAATAGC GAAATTCACC 12180 

CGTTAAAGTT TCGAGCTCTG ATAATTGTAT AGACCATTGA TGATTTTGAG GTGGTACTTG 12240

ATATAAATTT TCTTCTCTAA AATATTCATT TAAAATGCGT TCGATAGCCG CATACGCTGC 12300 

ATGTTGTATT AATTCTTTAT TTTGCACTTT TTTGTTTCAA CTCCCATAAT TTCATTAATG 12360 

TGTGATCGTT GATTTGATTA GTGATGGTTG AACAAATTAA AAATAAACTA CTTACTGCAA 12420 

ATACTACGCC CATAACGATA AACGTAGTAG CTGGTGTAGT ATAACTTGTA ATGGCAGCGC 12480 

cACTaAGACT GCCAATAATT TGACCAACAA CTAACATACT GTTCGTCGT7 CCAACAAATG 12540 

TGCCTTTAAG TTGTTGATGA CACGCATTCA CGACAACAAA CATGACACTT TGAATCAATG 12600 

CACTATATGT TAATCCTTGA AGTATTCTTG CAGCCATTAA AAACTCTATA TTCGTCGCTA 12660 

AACCTTGCAG TATCGCACTA CAACCACATG CAATCGTGGC AAATATATAT ACTGATTTAA 12720

CATATGATTT ATCATTAAAG CGTCCCCATA AAGGCGCGCT TAATATCGAA GCCGTCCAAA 12780 

ATGCGGACTG TAAAAATCCA ATCACACTAC GGTCATCTAT CGCTGTATGA TTCACTGATG 12840

AAGCAAGTGG TGATAATGCA GTTAGCATGC CATACATAGC AAAGTTTGCT AAAACGCCAA 12900 

CGATAATAAA TCGACATGTT TGTTGTGTGC ATAATAQACA TTGAAATGAA CGGCGAATAC 12960 

CTTTATTAAT ATTTGGTGTT TGTGATTTTG GCATATGTGT CGTTTCAATC AATTTTAATG 13020 

CACCGAAAAT ACAGACAATA AAAGTAATAA CGGCAATACT CATCAGTAAC GCACTAAAAC 13080 

CTAATATCGA AGCTGTAACA CCX3CCAATTA ATGGCCCCAC AAGAGACCCT GCGCTGACTG 13140 

AACTTTGCAG TCTTCCTAAT ACCTTTCCAC GATCTTCAGC TGGCGCCTCT GCACTCGCAA 13200

ACGCACTTGA TGCATCAACA ACACCACCAA ATAGTCCCTG CAATAACCTC ACAAGTACAA 13260 

ACTGTAATGG TGTCGTACAC AATGCCATTA AAAATAAGCA TACCGCCAAA CCAAGTAACG 13320

CtATCATCGT CGTTACAGCT GGAGCAGCAA TCGCTATACC ACTCCACAAC TGTATTTCTA 13440

CGACTGATAG ATTTTGTAGT GATGCCATAT AAATTGGCAA TAATGGCACA AGTACTGTCA 13500

GTCCAGCAAT CGCTATAAAC TGACTGAGCC ATAAAATGCG AAAGTTACTG CGCCATATAG 13560 

ACTGATTAAT CATATGTCAC CATTGGATTT GGTACGGTAG TTAAACCTGA AGGCATACTA 13620 

CCTCCACCAC TATCACGTTG ATATAGCAAT GGTAATAAAA TTTGTTTGAA TGGCCACGTC 13680

TGTTTATCAA ATAAAATGTG TCTGACAGCT AGCTGATCAG TTGTAACCCA GGAAATAGTT 13740 

GCCACTTCAT TTTTTAAAAT TTGTTTTAAC AACGACATAA GTTCATGCTC ACTTACACCA 13800 

AATAAATCTT GAATTGCATC AATAATGGCA TATAGATTTA CCGATACAGC TAATGTTTGA 13860 

AAATAAGCAA AGAATGTTTC CAAATCCTCA TTAATTAGCG TATTAGGT6T ATCTTCTCTG 13920 

ACGACATACT TCGGCAATGA AAGCTGATGT GCTGTTAGCC ATGGTTTATA AATTCTGACA 13980 

GTATCATGAT CACGTAACAC GCATTTTTGT ACACGTCCAT CTTCAAATGA CAACAATATA 14040 

TTTTGACCAT GCAACTCTGG TAATGCGCCG TATTGCATAA ATGATAGTGT TACCTTTAAA 14100 

AAGACTTGCG CGATATCTTC AAATAACGTC ATGACATCAT TTTTAGAAAT ATTATCTTTT 14160 

CCACAAATCA TTTGATATAA AGTGCGATCA TTTGCCGCGA GTGCTGCCAT TGACACTAGC 14220 

TGTTGCGTAT CATTTTTGGC TAGCACTTCG GGATACTTTC TTAGCTGAAC AGTTAGATGA 14280 

CCTAATTGAT CTTTGAAAAT ATCATTATCT TGACCCATAT ATGACCACCA AGCTGTTTCA 14340

TCACAAACCA TGACATACTT AGCTAGTGCT TCATCTTTTT CTATAAGCTG ACGTAATAAT 14400 

TGTTCTGCTT GTTCTCCGTT TTTCATGTAA CGCGTAGGCG TTAGCCTTAA TGCGCCTAAT 14460 

GACTGCATTG CAAATGGTAC TTTGACATGG TTATACGGTG CGCCAATATC AATTAATGAA 14520 

CGCATACTTG AAGACGACAG ATAATCTCCA AATTTTAACG GTAATAGTAC AACCAACTTT 14580 

TCACTAATCT CTTTCX3CAAA GACGTTCGGC AGAATATGCT GATATTGCCA AGGATGTACC 14640 

GGAAATAGTA CATAGTCATC TATTGATAAC CCTTGATCAT TTAACATGTC TGTCGCTTGT 14700 

TCTTTTATAG GTACTGTCAA ATTTTCTAAT TCATCGATAT TTGCAGTATC GCCATGAATC 14760 

ATATGTGTCT TTTTAACTGC TGCAACCATT AAAGGAAATG ATTGATTTAA TTCAGCTTGA 14820

TACACTTGAT AATCCGCTTC TCTTAATCCT CTTTTTTCTT TAGCTAATGG ATGAAATGGA 14880 

CGATCTTTTA AACTTGCAAA CTGCTCTGAC ATCACAAAAG GATGTGACGC TAAATCTAAT 14940 

TCTGATAATT GTTTAGCAAG CTGTGTGGCA GCAGTAGTCA GTCCTTCTTC AACGCGAGCC 15000 

ACTTCCCATT CATGACTTAG ATCACAATTC ATATTAGCAA TTGTTTGCCA AAATTCAGCT 15060 

GCCGTTAAAG GTTGCTTAGA CACCCTTCCC TCTATCGTAA TTGGTTGTGA ACTTTCGTAA 15120

TATATCAAAA GCGTTTGTCC GTTTTCTTTA GTAATCTCAC TATTCGATAC AATTCCGGCT 15240

ATATCTTCAA ATAATAATGC ATCAACTAAA TCTCTTAATA TTATCGCTTG TGCTGTATTG 15300

ACTGCTGTAT GATTCTGCAA TGTTCAGACA CCTCGCATTC TTAATATAGG TTCAATGTTG 15360

TCCCAATATT TTGTTGTTGT GCCTGTTGAT AAATAAAATA AGCACTTGAA ATATCTTCGA 15420

TAGCCATACC CATCGGATTA AGTAATATGA TCTCATCATC GTCTTCACGT CCTGGTATGT 15480

CACCTGTCAC AAGTTGTCCT AGTTCAGCAT GAAGAGCTTC TTTGCTGAAT TTACCTTCTA 15540

ACACCAATTG GTTAATAGTT TTCTTTTCTC GATTACATTG TGACCAGTCA TCTACTACGA 15600

CTTTGTCAGC TTTAATAAAG ACTTCTTTAT GCACATCCAT GATAGAAATG TTGCTAATAA 15660

ATGCACCCTT TTGTAACCAA TCATATTCAA TGTATGGTTG ATCCGTTACG GTACATGTAA 15720

TGACTACTTC ACCATTTGAT ACT6CTTCTT TAGCATTTTC TGTCGCAATA AAATTAATTT 15780

CCX3GACGCTG TTGTTGCCAT CTATCAACAA AGCGTGCACA TGCTTCAGAG AATTGATCGT 15840

AAACAAACAC GCGTTCAATA TGATCGAATT GCTCTAACAT ACTTTGTAAT TGCTTGTCTC 15900

CGATTAGCCC GCATCCAATG ATTGTTAAGT CTTTAAATCC TTTTTTAGCC AAATGCTTTG 15960

CTGCAATCAC TGAAACTGCT GCAGTACGCA TACTACTAAT TAAACTTGCT TCCATAACTG 16020

CAATTGGATA ATTCGTTTCT GGATCATTCA AAATAATGAC GCCACTTGCA CGCTCCATAT 16080

TACGTTTCGA TGGATTGTCG TGCTTACTAC CTATCCACTT AATACCTGAA ATTGCGTGTT 16140

CACCACCGAT ATGACTTGGC ATTGCAATAA TTCGATCTGC GATGTGTCCA TTTTCAGGAT 16200

CCtGTCTTAA ATACGGCTTA AGCX3GTTGTA CAAAATCATT GTGCGCATGG GCTGTTAATG 16260

CTTCTGTTAA TGCGTCCACA TAAACTTGTG AATGATTACC TCCCGCTTGT TCAATATCTG 16320

ATCTATTTAA ATACAACATC TCTCTatTCa TTCTGaTTTA ACTCCTTGTC TTGATTTCAT 16380

TTTTTCTAAC CATGTATCTG AATAAACTAA ATCTAAGTAA CGATCGCCTC GATCTGGTAA 16440

AATCGTGACA ATTGTTGCAC CTTCTTCAAT TGACGTTATC AACTGCTCAA TCGCTGCAAT 16500

AATCXtf^ACCT GTTGAACCTC CGGCAAATAT GCCTTCATAA TCAATCAGTT TTCGACAGCC 16560

CAAAGCAGAT TGATAATCAT CTACATGGAT CACTTGATTA ATTTCTGATC TATTCAATAT 16620

TTCGGGTACA CGACTAGCAC CQATACCAGG TAATTCTCTA TTAATAGGTT TGTCACCAAA 16680

AATGACTGAC CCTTTCGCAT CAACAGCAAC AATTTGTGCG TTTGGATGCA CTTCTTTTAT 16740

TTTTCTACTC ATACCCATAA TGCTACCTGT CGTGCTGACT GGCGCGACAA AATAATCTAT 16800

AGGTTGCTTA ATTGTTTCAA CAATCTCTGT GCCTGCACCA TGATAATGGG ATTGCCAATT 16860

TAACTCATTC GCATATTGAT TAATCCAATA TGCATCGTCA ATAGTGGCTA ACAGTTCTTG 16920

TACATTGGCA CCATAACTTT TAATAATTTT CAAATTTGTT GGTGATATTT TAGGATCAAC 17040
AACACACGTG AGTTTTAATC CCTTGATTTT AGCTATCATT GCCAACGCAA TGCCTAAATT 17100 

ACCAGAAGTA CTTTCAATTA AATGTGTATT CTCAGTGATT AAACCATGTT TAATACCATG 17160 
TTCAATGATG TACTTGGCAG GTCGATCTTT CATGCTGCCT CCAGGATTCA TATACTCTAA 17220 
CTTTGCAAAC ACTTCATGTT TCGGAAATAG TTGATGAAGT TGAACCATAG GTGTTTGCCC 17280 

TACAGAATCT AACAATGAAT CGTGCACATG 17310 
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5423 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

ATACTAGTAA GCGCATCGGT TATTGACATC GAATTCAACT TTAACAGTTT TCATGTTCGG 60

TGATGTTTCa ATAGAATGTG TGTGTTGTAC TTGCGCATTT ATATTTCCAC CTAAATTACT 120 

TAAGTTTCCT GTAATACTAG AAATGTCAGG TGCGTTTAAT GTAGGTTGAA ATGCATCAAC 180 

TACTTTATCT GCAACATTAG AAACATTACG GATAACTTTA CTTGAATGAT TATCTATACC 240

TTTAACGAAA CCTAACATTG AATACATACC AACATCCATG AATTCACGTG AAGGTGAGTG 300 

AATACCTAGC GCTCTTTTGG CTGCATTTAA AGCACCTTTT GCTACACTAG CTGCTTTTTC 360 

AGCTAAGTCT CTAGCCATAT TACCAATACC TCTCATCAAA CCACGGATCA TATCAGCACC 420 

TGCTGATACA AAGTCATCCA CAAAGCTTTT AACTTTATTT ACTGCATTTG TCATACCTTG 480 

ACTAACTTTG TTTACAACAT TAACGAATCC TTGAACAACT CTATTAACAA rGTTAATTAG 540 

CGTACtTGTt ATAGTAGATA CCCaTilGCAT ACCTTTAGTG ACmATGAAGT TCCAAGCTTG 600 

AGACATTTTG TCTGATATAG TTGAAACAAC TTGTGTGAAT ATGCTTACAA CTTTATTCCA 660 

AATTGTCGTT AATATACCAG ATAAGAAACT CCAAATCGTA TTCCATATAT TAGAAATAAA 720 

ACTCCATGCC GCTTGTAACG CAGTAQATAT AGCTGTAGTG ATAGCGTTCC AAACCTTAGT 780 

TGCCACAGTA ACTATAGTGT TCCACAACGT TTGTAAGAAC GTCCAAATAG CGTTCCAAAT 840 

TGTTATTGCG ATAGTCATAA TTGTGGTAAA CACTGTAGTT ATTACAGTGA CTAACAAATT 900

CCAAATCGTA GTAGCGATTG TAATTATCGT ATTCCAGATT GTACTTAAGA ACGTCCAAAT 960 

AGCTGTCCAT ATCGTCATAA CTATTGTCAT TATCGTOGTG AAAACAGTTG TAATGATTGT 1020

ATAAGCGACT ATTTGATTCC AAACAATCAT TATAAAATTG TAAACATTCX3 ATACTGCTGT 1140

AGTGATAGCT GTTAAAATAG CATTCCATAC AACCGAAGCT ACAGCTTTTA ATACATTCCA 1200 

AACATTAACC ATAAACGTTT TTATCGCATT CCAAGCATTT ATAATAAAGT TTCTGAATCC 1260 

TTCATTTTTA TTCCACAATA AAACGAATAT AGCTATTAAT GCAGCAATTA CACCAATTAC 1320 

TATTGTTATT GGACCGCCTA AAATACCAAA CACAGTTACT AGTCCTGTGA TAGCATTTCT 1380 

AATTAATCCA ATCTTACCGA ATAACAATTG GAATATAACT GATATAATTT TTAATGGTCC 1440 

TTTTAATAAC ATGAACGCAC CTTTTAAAAT T6TTAATCCC GCTCTTAATA AACCGAACTT 1500 

ACTTACTAAT GCAATGrTTC TACCTATTAA TCCGCCACCC ATAAAGTTAG ATACAGCAAG 1560

AATAATCGGT ATTAAAAATC TAAATGCACC AACTAAAGTT ATAATGACAC CAACTAATTG 1620 

TGCTGTAGCT GGATGCGCCT CAAACAAGTT AGCTATCCAA CCAGTTATTG CAACTGCAAC 1680 

GCGTAATACT GCACTAGCTA TAGGAGCCAT TGCTGTTGCG AATGCAmiTA ATCCTCTTGC 1740 

GATGTTTCCA ATCAATTGCA TTATTAGTGG TCCATTTGTT TGTATATAAC TGACAAAGTC 1800 

TTTAAACCCT TGAGATTGTC CTACTTGTTC AGACCATTCC CTAAACTTAG CTGTCATTTG 1860 

TTCAAGAGAT TGGAATATGC CAGTTGATGA TCCGCTGAAT GCATTCATCA AATTGTTAAT 1920 

TCCAACGAAA ACATTTTTGA AAATATTACC AATGATAGGT AAGTTTGTTT TTGTGTATTC 1980 

AATAAAACGA GTTATCGAAT TTTCTCCAGC TGCACTATTA GCCCAGTTAG AGAAAGATTG 2040

ACCTAATCTA TCCAACCAAT CAGCCGACCA TTGAAACAGT GGTGCTAATT GCGTGAATAC 2100 

ATTGACTAAT CCGTCACCAA AACCACCTGC AGCACTTAAT AGCTTGTTAA ATACCGAAAC 2160 

ACCCGTTGTA TTCATCATAT TAAAGAATCT TGAAGCTACA CTGCTATTTT CAGCCCATTT 2220

AAGCACGCTT TGAGACGCTT CTTCCATTCC TCTTGAAATA CCACTAAAAA ACGGTTGTAA 2280 

GCTCTGCATT GCAGTTTTAA CAGTATTTAA ACCATTTGCA AGAGTTGTGA AGATAGCGGA 2340 

TTGATTTTGC TTTATAATAT CAGTCCATGC TGACTTTACG CCATCTAACG CTrTTTTGTA 2400 

TTCX3TTTGTT GCTGAGCTAG CTTGTAAAGT GCCATCATTA AGCATCTTTA TAGCGCTGAT 2460 

AGCCATTGCG CCAAACGCTA CAAATCCTGC TCCCGCTATT GCTACGGCAC CACCTAAAGC 2520 

AAGTACACCA CCAGTTAACA CTTT6ATAGC GTTTAATAGC GCAAATACTA CAGGTACTAC 2580

GCTCGCTATT ACAGGTATTA AGATACTAAA AGATGATGTA AGTAATCCAC CAACCATATT 2640 

AGAACCTACA GTACCGAACA CACGGAACAT ATTAGCTAAA TTCCCCATCT GTCTTTGAAA 2700

ATTGTCATTT GCTTTTATTA TGTAGGCATA AGCTTTCTTT AAACCATTAG TATCGACATC 2760 

TACCTTTGTT GTTTTTTTGT TCGGCAATGC GTCTAATGAT TTTTTAAACG CATAAATAGT 2820 
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AAGTTCTTCT TTAGTACGTT TGATTTTAGA 

AGCTTTGGCT TTAGACCTAT TTAATGCTTC 

5 

ATTGAATTTA CTGTTATCTG CATTGACGTC 

TAATTTAGCT TCTGTTTCAG CGATATCTTT 

TGGTGTAACT TCTTTAGAGT TTAGTTTGTC 

70 

TTGTAAATCT TGTATACTAG CATCTAATTT 
TAAAGACTTT TTAGCAACTT TGATAGTTTT 

IS AACATCTTTA GTTTGATCTG CTACTCGTTT 
AATTTGCCTT TTGAATTTGG CTACACTAGC 
CACATTAACA CCTCTCTTTC TATTGCTTAT 

20 TATTTTGTGG TTCGTATTCA TCACGTTCGC 
GCCGTTGGAT ATTTTCTTCA TAAGGCAATA 
CTTTAGGTTT ATTTTCTGTC CCAACATTTT 

CAAGTTTGTA ACGTTCGAAT TCTTGGGTTA 
AGTTATATTC TGTTAATGTC ATTTGCTCAA 
ACATACAAGT TATAACGATT CTGTCGTAAG 

TTCCACTACT TCGACTAGGT TTCGGGTCAT 
CGAACCGAAT TCTTCTAGTC CGATATTTTC 
ATTAATAGTA ATTGCTTGTT TTTTTAAGTG
CACAACCGGA TTTCCACTTT CTAAACCTAC 

AGCTTGTTCA ACTTTTAAAC CTAATCGGTT 

ACTTAATTCT AATGACTTTC CGTTAATTTC 

ATTAATTTAA ACAAAATAAA mArGCTTAAC 

CGGTGGTGAA TCTACTTTAG GTTGTGGAAT 

TGCTTTTGTA GTGTCGTGGA ATCTGTATcC 
AGGTAGTGTT GCAAATCCAC GTTCGAAACG 
ATCAATACCG TTAGCTTCTG CTTTTAATTC
CGCTTTAAAT TTAGCGGAAT CCCCATTTTT 
TTCATACAAT ACGCGATCTA CAACTGCATC 



GTTAGCAACA CCATTGTCCA CGTCTATAAT 2940 

GAGACTAGCT TTAGATACTT TTAACACTCG 3000 

AATATTGACA CGTTTCTTTT CTAATTCTGA 3060 

AATCAACTTT TGTTTTTGCA ACTTAACTTC 3120 

TAGTTCAAAA TTCGATTCTA GTACCTTTTG 3180 

AGCTTTTACA TTTTTGTTAC TAAAGGCATC 3240 

TTGTAAATTT TTATCGTTAG CGTTTAATTC 3300 

AAATCTTTGC ACAGACTTAA CCGCACTATC 3360 

TTCAATAGTC GCTTTAATTT TATATTCCGT 3420 

TAAATTCTGC TATAACTTTA AAGAATTCAT 3480 

TACTAAATCT TATATCTTTA CCTTCGTTAA 3540 

CGTCGTTTGC ATTGTTAAAA ACATATTCCT 3600 

TAGTAGCTGC AGCATCACGA ATAGCAAACG 3660 

GCATTTCATA CTCTTTCGCA TACATTCGAT 3720 

TAACGTTCAA ATCTGTAATA CCAAGTGTTG 3780 

TTATTAGGcT TCCGCTGGTT TTTCTTCCGT 3840 

AGGTCGCTTT CCCAAcTCCG TTAAAATATC 3900 

TGCGATTTCA TCTAATGCTT CATCAATGTT 3960 

AGATGTAGCT GCGATTAAAA CTTOGCCAAT 4020 

AGGCAACATT GATACACCTT GACCGATAGA 4080 

ATCGATTTCT CTTAAAAATT TAAAACCAAA 4140 

TACATTCATA ACTTAAAATC TCCATTCATA 4200 

GCCCTATTTT TATACCTCTC TTGGTGCAAC 4260 

TGCTGTTAAA TCTTCGCCAG TTAATGCATC 4320 

AGTCGCCTTA AGTTTCTTTG TTACAGCCTC 4380 

ACCATTCACT CCATATTCAT ATTCATATTC 4440 

AAATTTATTG TGGAAACCTT GGAAATATTT 4500 

GCCTGGTATT CTACTTTCAA CTTCCCAAGC 4560 

TTCAATTTCA TCTGCAAAAT CGTCACCATA 4620 
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GTCCATTGTA TCCTCTGTAT CTGTATCAGC TTCATGTGAT AAGCCGTATT CAGTTAAAAA 4740 

AAGCATTTTA GTAGCATCTA CTTTTTCGCC AGCTTTTCTA AATAAAATAA TACGATCATT 4800

ACTATTTTTC ATATTTGCCA TTCAATATTC CTCCGTTTTT TAAAATGTTT TGTAAGATAT 4860 

CGTTACTGAT GTGTGTAGCA ATTCTTGATT GGTAGTATCA TCAACTAACT GTGTGATGTT 4920 

AGTATCTTCT TCTTCAAAGT CATAATCGTT TGTTTTAACG CTAGGTGTTA AATCATCAAT 4980

ACATCTTTTA ACAAGTCCGT CATGATGTCC TAAATCATCG CTTACACTCC AAATATCAAT 5040 

AACTAAATTC GTATCGCCAG AATAACTATC AAACGTGTAC TTACTTCTAT TTGACTCCGG 5100 

CATTTTTATT ACAAAAAAAG GATACGGAAT CTCTTGTTGC ATCTCTTTAC GAGAAATAAC 5160

AGGGAATCCA TATCCTTGTA GCGTTTCATA CGCTTTATTA TAAAGTTGTA AGTTCGGTGT 5220 

CATGCTTTTA TCTCCTATTC AAACAACQCT TTCAATTCTT CTACAGTTGA TTTCCTAATC 5280 

ACTTCGTATA CCGGCCACAT AAAAGGTTCA GCCTCCATGT ATCGAGTACC AAATTCTAAG 5340 

AAACCACTAT AAGCTGCGTG CGATGTGATA GTGTATTGCA AATCGCCAGT TTTTTTATAT 5400 

CTGATATTGC GTGATaAATT ACC 5423 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6251 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

AAACGCAGAT GTTCAATTAG AACCAGTCTA TCGTATTAAG GAAGGTATTA AACAAAAGCA 60 

AATACGAGAC CAAATTAGAC AAGCGTTAAA TGATGTGACA ATTCATGAAT GGTTAACTGA 120 

TGAACTAAGA GAAAAATATA AATTAGAGAC CTTGGACTTT ACTTTGAACA CATTACATCA 180 

TCCTAAAAGT AAAGAGGATT TATTACGTGC TCGTAGAACC TATGCATTTA CTGAACTGTT 240 

TTTATTCGAA TTACGTATGC AATGGCTAAA TAGATTAGAA AAGTCATCTG ACGAAGCAAT 300 

TGAAATTGAT TATGACATAG ACCAAGTTAA ATCATTTATT GATCGTTTAC CTTTTGAACT 360 

AACTGAAGCA CAGAAATCCA GTGTTAATGA AATTTTTAGA GATTTAAAAG CACCAATACG 420 

TATGCATCGA TTACTTCAAG GTGATGTAGG TTCAGGAAAA ACAGTAGTTG CTGCAATTTG 480

TATGTATGCG TTAAAAACTG CTGGTTATCA ATCAGCATTG ATGGTACCAA CTGAAATTTT 540 

AGCAGAGCAA CATGCTGAAA GTTTAATGGC TTTATTTGGA GATTCTATGA ACGTTGCATT 600 
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TACGATTGAT TGTTTAATTG GAACCCATGC TTTGATTCAA GATGATGTGA TTTTCCATAA 720 

TGTTGGTTTA 6TAATTACAG ATGAACAACA TCGATTTGGT GTGAATCAAC GCCAGCTTTT 780 

AAGAGAAAAA GGTGCAATGA CGAATGTGTT ATTTATGACA GCAACGCCGA TACCAAGAAC 840 

ACTAGCAATA TCAGTTTTTG GTGAGATGGA TGTGTCTTCA ATTAAACAAT TACCAAAAGG 900 

TCGTAAACCT ATCATTACTA CTTGGGCAAA GCATGAGCAA TACGATAAAG TTTTGATGCA 960 

AATGACCTCA GAGTTGAAAA AAGGTCGTCA AGCATATGTC ATTTGCCCGC TAATAGAAAG 1020 

TTCTGAGCAT CTCGAAGATG TTCAAAATGT TGTCGCATTG TACGAGTCTT TACAACAGTA 1080 

TTATGGTGTT TCCCGTGTAG GGTTATTGCA TGGTAAGTTA TCTGCCGATG AAAAAGATGA 1140

GGTCATGCAA AAGTTTAGTA ATCATGAGAT AAATGTTTTA OTTTCTACTA CTOTTGTTGA 1200 

AGTAGGTGTT AATGTACCGA ATGCAACTTT TATGATGATT TATGATGCGG ATCGCTTTGG 1260 

ATTATCAACT TTACATCAGT TACGCGGTCG TGTAGGTAGA AGTGACCAGC AAAGTTACTG 1320 

TGTTTTAATT GCATCCCCTA AAACAGAAAC AGGAATTGAA AGAATGACAA TTATGACACA 1380 

AACAACGGAT GGATTTGAAT TGAGTGAACG AGACTTAGAA ATGCGTGGTC CTGGAGATTT 1440

CTTTGGTGTT AAACAAAGTG GaTTGCCAGA TTTCTTAGTT GCCAATTTAG TTGAAGATTA 1500 

TCGTATGTTA GAAGTTGCTC GTGATGAAGC AGCTGAACTT ATTCAATCTG GCGTATTCTT 1560 

TGAAAATACG TATCAACATT TACGTCATTT TGTTGAAGAA AATTTATTAC ATCGTAGTTT 1620 

TGACTAATTG CCATGCTGAT TTGTCAATTT GAGTGCAACa CTTCGTTAAT TGAGTGATAT 1680 

GACACTTGAA CTATTTAAAT GTAAAGTGGT ATTTTAACAA TTTATAAATT TTCGACTAAA 1740 

TAATAGCTAA ATATTACAGT TATTTGTTGA GTCGGTTAAA TAGAAAGTGT TATGATATGT 1800

GAGGAATGTT TAAGACTAGG TACTAAAAAA TGAGGGGTGA GACGTTGAAA CTAAAGAAAG 1860 

ATAflACGTAG AGAAGCAATC AGACAACAAA TTGATAGCAA TCCCTTCATC ACAGACCATG 1920 

AACTAAGCGA CTTATTTCAA GTGAGTATAC AAACAATTCG TTtAGaTCGC ACTTATTTAA 1980 

ACATACCAGA ATTAAGGAAG CGTATTAAAT TAGTTGCTGA AAAGAATTAT GACCAAATAA 2040 

G77CTATTGA AGAACAAGAA TTTATTGGTG ATTTGATTCA AGTCAATCCa AATGTTAAAG 2100 

CGCAATCAAT TTTAGATATT ACATCQGATT CTGTTPTTCA TAAAACTGGA ATTGCOCGTG 2160 

GTCATGTGCT GTTTGCTCAG GCAAATTCGT TATGTGTTGC GCTAATTAAG CAACCAACAG 2220 

TTTTAACTCA TGAGAGTAGC ATTCAATTTA TTGAAAAAGT AAAATTAAAT GATACGGTAA 2280

GAGCAGAAGC ACGAGTTGTA AATCAAACTG CAAAACATTA TTACGTCGAA GTAAAGTCAT 2340 

ATGTTAAACA TACATTAGTT TTCAAAGGAA ATTTTAAAAT GTTTTATGAT AAGCGAGGAT 2400 
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TTAGAAGCCG TACAAAAGGC TGTTGAAGAC TTTAAAGATC TAGAAATTAT ACTTTTCGGT 2520 

GACGAAAAAA AGTATAATCT GAACCATGAA CGAATCGAAT TTAGACATTG TTCTGAAAAG 2580 

ATTGAAATGG AAGATGAGCC TGTTAGAGCG ATTAAACGTA AAAAAGATA6 CTCAATGGTA 2640 

AAAATGGCTG AAGCTGTGAA ATCTGGTGAA GCAGATGGAT GTGTGTCAGC AGGTAATACT 2700 

GGTGCTTTAA TGTCAGCTGG TTTATTCATT GTTGGACGTA TTAAAGGTGT AGCTAGACCG 2760 

GCTTTAGTAG TAACATTGCC AACGATTGAT GGAAAAGGTT TTGTCTTTTT AGACGTTGGT 2820 

GCAAATGCTG ATGCTAAACC TGAACACTTA TTACAGTATG CGCAACTAGG GGATATTTAT 2880 

GCTCAAAAAA TTAGAGGTAT TGATAATCCG AAAATCTCAT TATTAAATAT AGGAACCGAG 2940

CCAQCTAAAG GTAATAGTTT AACGAAAAAA TCATATGAGT TATTAAATCA TGATCATTCA 3000 

TTGAATTTTG TTGGGAATAT TGAAGCGAAG ACATTAATGG ATGGCGATAC AGATGTTGTA 3060 

GTTACCGATG GCTATACTGG GAACATGGTC CTTAAAAATT TAGAAGGTAC TGCAAAATCA 3120 

ATCGGTAAAA TGTTAAAAGA TACGATTATG AGTAGTACTA AAAATAAATT AGCAGGTGCA 3180 

ATATTGAAGA AAGATTTAGC TGAATTCGCT AAAAAGATGG ATTACTCAGA ATACGGTGGT 3240 

TCCGTATTAT TAGGATTGGA AGGTACTGTA GTTAAAGCAC ACGGTAGTTC AAATGCTAAA 3300 

GCTTTTTATT CTGCAATTAG ACAAGCGAAA ATCGCAGGAG AACAAAATAT TGTACAAACA 3360 

ATGAAAGAGA CTGTAGGTGA AtCAAATGaG TaAAACAGCA ATTATTTTTC CGGGACAAGG 3420

TGCCCAAAAA GTTGGTATGG CGC7UVGATTT GTTTAACAAC AATGATCAAG CAACTGAAAT 3480 

TTTAACTTCA GCAGCGAACA CATTAGACTT TGATATTTTA GAGACAATGT TTACTGATGA 3540 

AGAAGGTAAA TTGGGTGAAA CTGAAAACAC ACAACCAGCT TTaTTGaCGC aTAGTTCGGC 3600

ATTATTAGCA GCGCTAAAAA ATTTGAATCC TGATTTTACT ATGGGGCATA GTTTAGGTGA 3660 

ATATTCAAGT TTAGTTGCAG CTGACGTATT ATCATTTGAA GATGCAGTTA AAATTGTTAG 3720 

AAAACGTGGT CAATTAATGG CGCAAGCATT TCCTACTGGT GTAGGAAGCA TGGCTGCAGT 3780 

ATTGGGATTA GATTTTGATA AAGTCGATGA AATTTGTAAG TCATTATCAT CTGATGACAA 3840 

AATAATTGAA CCAGCAAACA TTAATTGCCC AGGTCAAATT GTTGTTTCAG GTCACAAAGC 3900 

TTTAATTGAT GAGCTAGTAG AAAAAGGTAA ATCATTAGGT GCAAAACGTG TCATGCCTTT 3960 

AGCAGTATCT GGACCATTCC ATTCATCGCT AATGAAAGTG ATTGAAGAAG ATTTTTCAAG 4020 

TTACATTAAT CAATTTGAAT GGCGTGATGC TAAGTTTCCT GTAGTTCAAA ATGTAAATGC 4080

GCAAGGTGAA ACTGACAAAG AAGTAATTAA ATCTAATATG GTCAAGCAAT TATATTCACC 4140 

AGTACAATTC ATTAACTCAA CAGAATGGCT AATAGACCAA GGTGTTGATC ATTTTATTGA 4200 
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AACATCAATT CAAACTTTAG AAGATGTGAA AGGATGGAAT GAAAATGACT AAGAGTGCTT 4320 

TAGTAACAGG TGCATCAAGA GGAATTGGAC QTAGTATTGC GTTACAATTA GCAGAAGAAG 4380 

GATATAATGT AGCAGTAAAC TATGCAGGCA GCAAAGAGAA AGCTGAAGcA GTAGTCGAAG 4440 

AAATCAAAGC TAAAGGTGTT GACAGTTTTG CGATTCAAGC AAATGTTGCC GATGCTGATG 4500 

AAGTTAAAGC AATGATTAAA GAAGTAGTTA GCCAATTTGG TTCTTTAGAT GTTTTAGTAA 4560

ATAATGCAGG TATTACTCGC GATAATTTAT TAATGCGTAT GAAAGAACAA GAGTGGGATG 4620 

ATGTTATTGA CACAAACTTA AAAGGTGTAT TTAACTGTAT CCAAAAAGCA ACACCACAAA 4680 

TGTTAAGACA ACGTAGTGGT GCTATCATCA ATTTATCAAG TGTTGTTGGA GCAGTAGGTA 4740

ATCCGGGACA AGCAAACTAT GTTGCAACAA AAGCAGGTGT TATTGGTTTA ACTAAATCTG 4800 

CGGCGCGTGA ATTAGCATCT CGTGGTATCA CTGTAAATGC AGTTGCACCT GGTTTTATTG 4860

TTTCTGATAT GACAGATGCT TTAAGTGATG AGCTTAAAGA ACAAATGTTG ACTCAAATTC 4920

CGTTAGCACG TTTTGGTCAA 6ACACAGATA TTGCTAATAC AGTAGCGTTC TTAGCATCAG 4980 

ACAAAGCAAA ATATATTACA GGTCAAACAA TCCATGTAAA TGGTGGAATG TACATGTAAT 5040 

ATATTTGAGC TAAAGCTCAT TGACGCAGTG GTTGACTGGT CATCCAATGG AGAATTGTCT 5100 

GACCTAGTCA ACTTTGCGGG GGAAATTCTA AGCAACCTAG ATAAGGTTCC AGAATTTCTC 5160 

CCTAAGAAAC ACTAATCAAT aAATTGwTAA GTGTTTCTAA AATTTCTACT TGTTTTTTAG 5220 

AATTTAAAAT GGGAAAATAT AGTAGTCTAT GTATAGGCAT TTTTAAAGGA GGTGAATCGA 5280 

CGTGGAAAAT TTCGATAAAG TAAAAGATAT CATCGTTGAC CgTTTAGGTG TAGACGCTGA 5340 

TAAAGTAACT GAAGATGCAT CTTTCAAAGA TGATTTAGGC GCTGACTCAC TTGATATCGC 5400

TGAATTAGTA ATGGAATTAG AAGACGAGTT TGGTACTGAA ATTCCTGATG AAGAnGCTGA 5460 

AAAA^CAAC ACTGTTGGTG ATGCTGTTAA ATTTATTAAC AGTCTTGAAA AATAATAAAT 5520 

CTTACATCTG GGTCGTCAGT ATTGTCGACT CAGTTTTTTT CTTTAATTAT CAATAGTTTT 5580

AACGTAAAAT TAAAGATGAT TCAAGAGCAA CACATAAAGG AGATAAAATA ATGTCTAAAC 5640 

AAAAGAAAAG TGAGATAGTT AATCGTTTTA GAAAGCGCTT TGATACTAAA ATGACAGAGT 5700 

TAGGCTTTAC TTATCAAAAT ATTGATTTAT ACCAACAAGC ATTTTCGCAT TCGAGTTTTA 5760 

TTAATGATTT TAATATGAAT CGTTTAGACC ATAATGAGCG TTTAGAGTTT TTGGGTGATG 5820 

CGGTATTAGA ATTGACGGTT TCACGATATT TATTTGATAa ACATCCCAAC TTGCCAGAAG 5880

GGAATTTAAC AAAAATGCGT GCCaCTATTG TATGTGAGCC CtCACTkGTA ATATTTGCGA 5940 

ATAAAATTGG ATTGAACGAA ATGATTTTAC TTGGTAAAGG TGAAGAGAAA ACAGGGGGAC 6000 
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ATCAAGGACT AGATATAGTT TGGAAATTTG CTGAGAAAGT CATTTTCCCA CATGTAGAAC 



6120 



AAAATGAGTT ATTAGGCGTG GTAGATTTTA AAACACAATT CCAAGAATAT GTGCACCAGC 6180

AAAATAAAGG TGATGTAACC TATAATTTAA TAAAAGAAGA GGGACCGGCA CATCATCX3TC 6240

TATTCACTTC A 6251



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4920 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

ACCTACTGAA GTTGCTAATT TTTTGGAGCA ACTAAGCACT GAAATTGAAC GTCTTAAAGA 60 

AGATAAAAAA CAACTTGAAA AAGTAATCGA AGAGAGaGAT ACTAATATTA AGTCTTATCA 120 

AGACGTGgCA TCAATCTGTA AGTGaTGCTT TGATACAAGC TCAAAAAGCT GGTGAAGAAA 180 

CTAAGCAAGC TGCAGAGAAA CAAGCTGAAG CGATTATAGC TAAGGCAGAA GCGCAAgcTA 240 

ATCAAATGGT TGGTGACGCG GTAGAAAAAG CACGCCGTTT AGCATTCCAG ACTGAAGATA 300 

TGAAACGTCA ATCAAAAGTA TTTAGATCGC GTTTCCGTAT GTTAGTTGAA GCGCAATTAG 360 

ACTTATTAAA AAACGAAGAT TGGGATTACT TGTTGAATTA TGATTTAGAC GCTGAACAAG 420 

TGACGCTTGA AAATATtCAT CATTTGCATG AAAATGATTT AAAGCCAGAT GAAGTTGCAG 480 

CAAATGCACA AAATAATGCA TCAAATACAC CAGACAATAA TCAACAATCC AATGATTCAG 540 

AAACAACTAA GAAGTAAGAA TTAAATAAAG ACAGACGCGT AATATACATT TAACTTTTCA 600 

CAGCGAATTA GGTAATGGTG AGAGCCTAGT AAAAGCATGT ATGTTATATC ACTGGCTTTT 660 

TAATATTTAA ATAATGTAAT GAGAGAACTC TAAGTTGAGT TAATAAGGGT GGTACCGCGA 720 

GCAATCGTCC CTTTTAATTT AACTTAQAGT TTTTTAAATT TTTAAGGAGT GAAAAAAATG 780 

GATTACAAAG AAACGTTATT AATGCCTAAA ACAGATTTCC CAATGCGAGG TGGTTTACCA 840 

AACAAGGAAC CGCAAATTCA AGAAAAATGG GATGCAGAAG ATCAATACCA TAAAGCGTTA 900 

GAAAAAAATA AAGGTAACGA AACATTCATT TTACATGATG GCCCACCATA CGCGAATGGT 960 

AACTTACATA TGGGACATGC CTTGAACAAA ATTTTAAAAG ACTTTATTGT ACGTTATAAA 1020 

ACTATGCAAG GGTTCTATGC ACCATACGTA CCAGGTTGGG ATACACATGG TTTACCAATT 1080 

GAACAAGCAT TAACGAAAAA AGGTGTTGAC CGAAAGAAAA TGTCAACAGC TGAATTCCGT 1140 
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TTAGGTGTTC GTGGTGACTT TAATGATCCA TATATTACAT TAAAACCTGA ATACGAAGCT 1260 

GCACAAATTC GTATTTTTGG AGAAATGGCA GATAAAGGTT TAATTTATAA AGGTAAAAAG 1320 

CCAGTTTATT GGTCTCCTTC AAGTGAGTCT TCATTAGCAG AAGCAGAAAT TGAATATCAC 1380 

GATAAACX5TT CAGCATCAAT TTACGTTGCA TTTGACGTTA AAGATGACAA AGGTGTCGTT 1440 

GATGCAGATG CTAAATTTAT TATCTGGACA ACAACGCCAT GGACAATTCC ATCAAATGTT 1500 

GCGATTACCG TTCATCCTGA ATTAAAATAT GGTCAATACA ATGTAAATGG CGAAAAATAT 1560 

ATTATTGCAG AAGCCTTGTC TGACGCTGTA GCAGAAGCAC TGGaTTGGGA TAAAGCATCA 1620 

ATCAAATTAG AAAAAGAATA CACAGGTAAA GAATTAGAGT ATGTTGTAGC ACAACATCCA 1680

TTCTTAGACA GAGAATCGTT AGTGATTAAT GGTGATCATG TTACTACAGA TGCTGGTACA 1740 

GgTTGTGTAC ATACAGCACC AGGTCACGGG GAAGATGACT ATATTGTTGG TCAAAAATAT 1800 

GAATTGCCAG TAATTAGTCC AATCGATGAT AAAGGTGTAT TTACTGAAGA AGGCGGCCAA 1860 

TTTGAAGGGA TGTTCTATGA TAAAGCTAAT AAAGCCGTTA CTGATTTATT AACAGAAAAA 1920 

GGTGCACTAT TAAAATTAGA CTTTATTACA CATAGCTATC CACACGACTG GAGAACAAAA 1980 

AAACCTGTAA TCTTCCGTGC TACACCACAA TGGTTTGCCT CAATCAGTAA AGTAAGACAA 2040 

GATATTTTAG ATGCAATCGA AAATACAAAC TTCAAAGTAA ATTGGGGTAA AACACGTATT 2100 

TACAATATGG TTCGTGACCG TGGCGAATGG GTTATTTCTC GTCAACGTGT GTGGGGTGTA 2160

CCGTTACCAG TATTTTATGC TGAAAATGGC GAAATTATCA TGACGAAAGA AACAGTGAAT 2220 

CATGTTGCTG ATTTATTTGC AGAACACGGT TCAAATATTT GGTTTGAAAG AGAAGCGAAA 2280 

GACTTACTAC CAGAAGGATT TACACATCCA GGCAGCCCTA ACGGTACATT TACTAAAGAA 2340

ACAGACATTA TGGACGTTTG GTTTGATTCT GGTTCATCAC ACCGTGGCGT GTTGGAAACA 2400 

AGAGCGGAAT TAAGTTTCCC AGCGGATATG TATTTAGAAG GTAGTGACCA ATATCGTGGT 2460 

TGGTTCAACT CTTCTATCAC AACTTCAGTT GCTACAAGAG GAGTATCACC TTATAAATTC 2520 

TTACTTTCTC ATGGTTTTGT TATGGACGGT GAAGGTAAGA AAATGAGTAA ATCTTTAGGT 2580 

AATGTGATTG TACCTGACCA AGTGGTTAAA CAAAAAGGTG CTGATATTGC GAGACTTTGG 2640 

GTAAGTAGTA CGGACTATTT AGCTGATGTT AGAATTTCTG ATGAAATTTT AAAACAAACA 2700 

TCT6ATGTTT ATCGTAAAAT CAGAAATACA TTAAGATTTA TGTTAGGTAA CATTAACGAT 2760 

TTCAATCCTG ACACAGATAG CATTCCTGAA TCAGAGTTAT TAGAAGTGGA TCGTTACTTG 2820

CTAAATCGTT TACGTGAATT TACTGCAAGT ACGATTAACA ACTATGAAAA CTTTGACTAC 2880 

TTAAATATTT ATCAAGAAGT TCAAAACTTT ATCAATGTTG AGTTAAGTAA TTTCTATTTG 2940 
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CTVAACAGTGT TATATCAAAT TTTAGTTGAT ATGACGAAGT TGTTAGCACC AATCTTAGTG 3060 

CATACAGCTG AAGAAGTTTG GTCTCATACA CCACATGTTA AAGAAGAAAG TGTTCACTTA 3120 

GCAGACATGC CTAAAGTTGT AGAAGTAGAT CAAGCTTTAT TGGATAAATG GCGTACATTT 3180 

ATGAATTTAC GTGATGATGT GAACCGTGCA TTAGAAACTG CTCGTAATGA AAAAGTTATT 3240 

GGTAAATCAT TAGAAGCTAA AGTTACGATT GCTAGTAACG ATAAATTTAA TGCATCTGAA 3300 

TTCTTAACTT CATTTGATGC ATTACATCAA TTATTTATCG TGTCACAAGT TAAAGTTGTA 3360 

GATAAGTTAG ACGATCAGGC AACAGCTTAT GAACATGGTG ATATTGTCAT CGAACATGCA 3420 

GATGGTGAAA AATGTGAAAG ATGTTGGAAC TATTCAGAGG ATCTTGGTGC TGTTGATGAA 3480 

TTGACGCATC TATGTCCACG ATGCCAACAA GTTGTAAAAT CACTTGTATA ATTGAAATTG 3540 

TATAAAGTAC TCATACAGAT GATATAAATT AAAGCTCTCT TCATAATCAT GTTGTAGTTT 3600 

TTGTTGACAT GATGAAGAGA GTTTTTTTGT GAATAAAAAA ATGACGAAGT TACCGGTCAT 3660 

ATATGTAAAA AATGTGCXAT TTACTAAAAT AAAAATTATT CAGGAATGGT ACAAATTCTC 3720 

TGAGGCATAT AAATGCGTTA TAGTTGCTAT TCTCAATTAT GTTCGCGATA ATTTTAAGTA 3780 

AAAGTAAGCA CAGATATTGA ATTTGATAGG AGTTAATTGA ATGTATCATA ACAGTAACGC 3840 

AAACTTTGTC AATGGTATCA CTTTAAAT6T GAGAGATAAG AATGAATTAA AGCCATTTTA 3900 

TGAGGACATA TTAGGATTAA ATATTATAAA TGAGACATTA ACATCGATAC AATATGAAGT 3960

AGGTCAAAAT AATCATGTCA TTACACTTGT TGAATTACAA AATGGACGTG AACCTTTAAT 4020 

GTCCGAAGCG GGACTGTTTC ATATCGCAAT TAAACTACCT CAAATTAGTG ATTTAGCTAA 4080 

TTTACTAATT CATTTAAGCG AATATGATAT TCCAGTTAAC GGAGGTATAC AGCCTGCTTC 4140 

GTTATCATTA TTTTTTGAAG ACCCGGAAGG AAACGGTTTT AAATTTTATG TTGATAAAGA 4200 

CGAAGCGCAA TGGACGAGGC AAAATAATTT AGTAAAAATT GATATTAGAC CATTAAATGT 4260 

ACCGAGATTA GTGAGTCATG CAACAAAATT GTTATGGTTA GGTATTCCAG ATGACGCTAT 4320 

TATAGGTGCA TTGCATATTA AGACAATTCA TTTATCAGAG GTAAAAGAGT ACTACCTCGA 4380 

TTATTTTGGA TTAGAGCAAT CGGCATATAT GGATGATTAT TCAATATTTT TAGCATCGAA 4440 

TGGCTATTAT CAACATTTGG CCATGAATQA TTQGGTATCA 6CAACGAAAC GTGTAGAAAA 4500 

TTTTGATACG TATGGATTAG CAATTGTTGA CTTTCATTAT CCTGAAACAA CACATTTAAA 4560 

TTTACAAGGT CCGGATGGTA TCTATTATCG CTTTAATCAT ATCGAAGTTG AAGATTAGTA 4620

TATACTTTGA ATGGACGAAC CATATAATGA ATCGTTTTTA ATGATCTTTT TATACAAGTT 4680 

ATGAAGGAGG CTGGGACATT AAGTTCTTAG GCAATGTAAA AAGCTGATTT CTATTAATTA 4740 
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TTTTCCTTAT ATTAATTGCC ATTAATACAA AACCTAGCTC TCGTTTAACT TTATTTATTC 4860 
CTCGAACTGA CATTCGnGTG AACTCAAAAT nGCCTACTTn CTTAAATTAC CAATATCTAT 4920 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 626 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TGGATTGCCA TTACATGGAC AAGATTTAAC TGAATCAATT ACACCATATG AAGGTGOTAT 60 

CGCTTTTGCA AGTAAACCAT TAATTGATGC TGATTTTATT GGTAAATCTG TATTAAAAGA 120 

TCAAAAAGAA AATGGTGCAC CAAGAAGAAC AGTGGGATTA GAATTACTTG AAAAAGGAAT 180 

TGCAAGAACT GGTTATGAAG TTATGGATTT AGATGGAAAT ATTATTGGAG AAGTAACTTC 240 

AGGAACACAG TCTCCATCAT CAGGAAAATC AATTGCACTT GCAATGATAA AAAGAGATGA 300 

GTTTGAAAT6 GGTAGAGAGT TGCTT6TTCA AGTTCGTAA6 CGTCAATTAA AAGCGAAAAT 360 

TGTTAAGAAA AATCAAATTG ATAAATAATT AAAAAGGGGT GTGCATTGTG AGTCATOGTT 420 

ATATACCTTT AACTGAAAAA GACAAGCAAG AAATGTTACA AACAATT6GT GCAAAATCTA 480 

TAGQAGAATT ATTCGGTGAT GTACCAAGTG ACATTTTATT AAATAGAGAT TTAAATATTG 540 

CTGAAGGCGA ACGGAGAACA AC6TTACTTA GAAGATTnAA TCGCATTGCA AGCAAGAGTA 600 

TCACTAGAG6 AACGCGTACA TCGTTT 626

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1126 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
nGGAAGTGGT GTATATATTT GTAAT6ACTG TATTGAATTA TGCTCAGAAA TCGTCGAAGA 60 
AGAATTAGCT CAAAACACTT CTGAAGCGAT GACAGAATTA CCTACTCCTA AAGAAATTAT 120
GGATCATTTA AACGAATATG TTATTGGTCA AGAAAAAGCT AAAAAATCTT TAGCTGTAGC 180 
TGTTTATAAC CACTATAAGC GTATTCAACA ATTAGGACCA AAAGAAGATG ATGTTGAATT 240 
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AACCTTAGCC AAGACGTTGA ATGTACCATT TGCAATTGCA GATGCGACAA GTTTAACTGA 360 

AGCTGGTTAT GTAGGCGATG ATGTTGAAAA TATCTTGTTG AGATTAATTC AAGCAGCTGA 420 

CTTTGACATT GATAAAGCCG AAAAAGGTAT TATTTATGTA GATGAAATTG ATAAAATTGC 480 

ACGTAAATCT GAAAACACAT CTATAACACG TGACGTTTCA GGTGAAGGTG TTCAACAAGC 540 

ATTOCTTAAA ATCTTAGAAG GTACGACTGC AAGTGTTCC6 CCACAAGGTG GACGCAAACA 600 

TCCAAACCAA GAAATGATTC AAATTGATAC AACAAATATC TTATTTATTC TTGGTGGTGC 660 

CTTTGATGGT ATTGAAGAAG TGATTAAGCG CXX5TCTTGGT GAAAAAGTTA TTGGTTTCTC 720 

AAGCAATGAA GCTGATAAAT ATGACGAACA AGCATTATTA GCACAAATTC GCCCAGAAGA 780 

TTTGCAAGCC TATGGTTTGA TTCCTGAATT TATCGGACGT GTGCCAATTG TAGCTAATTT 840 

AGAAACATTA GATGTAACTG CGTTGAAAAA CATCTTAACG CAACCTAAAA ATGCACTTGT 900 

GAAACAATAT ACTAAAATGC TGGAATTAGA TGATGTGGAT TTAGAGTTCA CTGAAGAAGC 960 

TTTATCAGCA ATTAGTGAAA AAGCAATTGA AAGAAAAACA GGTGCGCGTG GTTTACGTTC 1020 

AATCATAGAA GAATCGTTAA TCGATATTAT GTTTGATGTG CCTTCTAACG AAAATGTAAC 1080 

GAaGGTAGTT ATTACAGCAC AAACmATTAA TGrAGaACTG AACCAG 1126 
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4392 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
ATTGACTTCT TAGCAATnAA TaTGAGTGAA GAACGTACTG TTGAAGTACC AGTTCAATTA 60 

GTTGGTGAAG CAGTAGGCGC TAAAGAAGGC GGCGTAGTTG AACAACCATT ATTCAACTTA 120 
GAAGTAACTG CTACTCCAGA CAATATTCCA GAAOCAATCG AAGTAGACAT TACTGAATTA 180 
AACATTAACG ACAGCTTAAC TGTTGCTGAT GTTAAAGTAA CTGGCGACTT CAAAATCGAA 240 

AACGATTCAG CTGAATCAGT AGTAACAGTA GTTGCTCCAA CTGAAGAAGC AACTGAAGAA 300 
GAAATCGAAG CTATGGAAGG CGAACAACAA ACTGAAGAAC CAGAAGTTGT TGGCGAAAGC 360 
AAAGAAGACG AAGAAAAAAC TGAAGAGTAA TTTTAATCTG TTACATTAAA GTTTTTATAC 420
TTTGTTTAAC AAGCACTGTG CTTATTTTAA TATAAGCATG GTGCTTTTTG TGTTATTATA 480 
AAGCTTAATT AAACTTTATT ACTTTGTACT AAAGTTTAAT TAATTTTAGT GAGTAAAAGA 540 
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CTTACTAAGC TAAAGAATAA TGATAATTGA TGGCAATGGC GGAAAATGGA TGTTGTCATT 660 

ATAATAATAA ATGAAACAAT TATGTTGGAG GTAAACACGC ATGAAATGTA TTGTAGGTCT 720 

AGGTAATATA GGTAAACGTT TTGAACTTAC AAGACATAAT ATCGGCTTTG AAGTCGTTGA 780 

TTATATTTTA GAGAAAAATA ATTTTTCATT AGATAAACAA AAGTTTAAAG GTGCATATAC 840 

AATTGAACGA ATGAACGGCG ATAAAGTGTT ATTTATCGAA CCAATGACAA TGATGAATTT 900 

GTCAGGTGAA GCaGTTGCAC CGATTATGGA TTATTACAAT GTTAATCCAG AAGATTTAAT 960 

TGTCTTATAT GATGATTTAG ATTTAGAACA AGGACAAGTT CGCTTAAGAC AAAAAGGAAG 1020 

TGCGGGCGGT CACAATGGTA TGAAATCAAT TATTAAAATG CTTGGTACAG ACCAATTTAA 1080

ACGTATTCGT ATTGGTGTGG GAAGACCAAC GAATGGTATG ACGGTACCTG ATTATGTTTT 1140 

ACAACGCTTT TCAAATGATG AAATGGTAAC GATGGAAAAA GTTATCGAAC ACGCAGCACG 1200 

CGCAATTGAA AAGTTTGTTG AAACATCACG ATTTGACCAT GTTATGAATG AATTTAATGG 1260 

TGAAGTGAAA TAATGACAAT ATTGACAACG CTTATAAAAG AAGATAATCA TTTTCAAGAC 1320 

CTTAATCAGG TATTTGGACA AGCAAACACA CTAGTAACTG GTCTTTCCCC GTCAGCTAAA 1380 

GTGACGATGA TTGCTGAAAA ATATGCACAA AGTAATCAAC AGTTATTATT AATTACCAAT 1440 

AATTTATACC AAGCAGATAA ATTAGAAACA GATTTACTTC AATTTATAGA TGCTGAAGAA 1500 

TTGTATAAGT ATCCTGTGCA AGATATTATG ACCGAAGAGT TTTCAACACA AAGCCCTCAA 1560

CTGATGAGTG AACGTATTAG AACTTTAACT GCGTTAGCTC AAGGTAAGAA AGGGTTATTT 1620 

ATCGTTCCTT TAAATGGTTT GAAAAAGTGG TTAACTCCTG TTGAAATOTG GCAAAATCAC 1680 

CAAATGACAT TGCGTGTTGG TGAGGATATC GATGTGGACC AATTTCTTAA CAAATTAGTT 1740

AATATGGGGT ACAAACGGGA ATCCGTGGTA TCGCATATTG GTGAATTCTC ATTGCGAGGA 1800 

GGTMTATCG ATATCTTTCC GCTAATTGGG GAACcAATCA GAATTGAGCT ATTTGATACC 1860 

GAAATTGATT CTATTCGGGA TTTTGATGTT GAAACGCAGC GTTCCAAAGA TAATGTTGAA 1920 

GAAGTCGATA TCACAACTGC AAGTGATTAT ATCATTACTG AAGAAGTGAT CAGCCATCTT 1980 

AAAGAAGAGT TAAAAACTGC ATATGAAAAT ACAAGACCCA AAATAGATAA ATCAGTGCGC 2040 

AATGATTTGA AAGAAACGTA TGAAAGCTTT AAATTATTCG AAAGTACATA CTTTGATCAT 2100 

CAAATACTAC GTCGCTTAGT AGCGTTTATG TATGAAACAC CTTCGACAAT TATTGAGTAT 2160 

TTCCAAAAAG ATGCAATCAT TGCAGTTGAT GAATTTAATC GTATTAAAGA AACTGAAGAA 2220

AGTTTAACAG TAGAGTCTGA TTCGTTTATT AGCAATATTA TTGAAAGTGG TAATGGATTT 2280 

ATAGGACAAA GTTTTATAAA ATATGATGAT TTTGAAACAT TGATTGAAGG CTATCCTGTC 2340 
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TCATGTAAAC CTGTCCAACA ATTTTATGGG CAATATGACA TTATGCGTTC TGAATTTCAA 2460 

CGATATGTTA ATCAAAACTA TCATATCGTG GTTTTGGTCG AAACCGAAAC TAAAGTTGAA 2520 

CGTATGCAAG CGATGTTAAG TGAAAtGCAT ATTCCATCAA TAACAAAATT GCATCGCTCA 2580 

ATGTCATCGG GGCAAGCAGT GATTATTGAA GGCAGTTTAT CTGAAGGATT TGAACTACCT 2640 

GATATGGGAT TAGTTGTCAT TACTGAGCGT GAgcTTTTTA AATCAAAACA GAAAAAGCAA 2700 

CGAAAACGTA CGAAAGCTAT CTCAAATGCT GAAAAAATTA AGTCTTACCA AGATTTAAAT 2760 

GTGGGAGATT ATATTGTTCA TGTGCATCAT GGTGTTGGTA GATATTTAGG TGTTGAGACG 2820 

CTCGAAGTGG 6GCAAACGCA TCGTGATTAT ATTAAATTGC AATATAAAGG TACGGATCAA 2880

CTATTTGTTC CAGTA6ATCA AATGGATCAA GTTCAAAAAT ATGTAGCTTC GGAAGATAAG 2940 

ACGCCAAAAT TAAATAAACT CGGTGGCAGT GAATGGAAAA AAACAAAAGC TAAAGTTGAA 3000 

CAAAGTGTTG AAGATATTGC TGAAGAGTTG ATTGATTTAT ATAAAGAAAG AGAAATGGCA 3060 

GAAGGTTATC AATATGGGGA AGACACAGCT GAGCAAACAA CATTTGAATT AGATTTTCCA 3120 

TATGAACTTA CGCCTGACCA AGCTAAATCT ATCGATGAAA TTAAAGATGA CATGCAAAAA 3180 

TCGCGTCCAA TGGATCGCTT GCTATGTGGT GATGTTGGTT ATGGTAAAAC TGAAGTTGCA 3240 

GTGAGAGCAG CATTCAAAGC TGTAATGGAA GGAAAGCAGG TTGCATTTTT AGTTCCTACA 3300 

ACTATTTTAG CTCAGCAACA TTATGAGACG TTAATTGAGC GTATGCAAGA TTTTCCTGTT 3360 

GAAATTCAAT TAATGAGTCG TTTTAGAACG CCTAAAGAGA TAAAACAAAC TAAGGAAGGA 3420 

CTTAAAACTG GATTTGTTGA CATAGTTGTT GGTACACACA AATTACTTAG TAAAGATATA 3480 

CAGTATAAAG ATTTAGGGCT OTTGATTGTA GATGAAGAAC AACXyVTTTGG TGTACGCCAT 3540

AAAGAGCGTA TTAAAACATT AAAACATAAT GTAGATGTAC TAACATTGAC TGCAACCCCA 3600 

ATAGCTAGAA CATTGCATAT GAGTATGCTA GGTGTGCGGG ATTTGTCAGT GATTGAAACG 3660 

CCGCCAGAAA ATCGTTTCCC AGTTCAAACA TATGTATTAG AACAGAACAT GAGTTTTATC 3720 

AAAGAAGCTT TAGAAAGAGA ACTATCCCX5T GATGGCCAAG TGTTTTATCT TTATAATAAA 3780 

GTGCAATCCA TTTATGaAAA ACGAGAACAA CTCCAGATGT TAATGCCAGA TGCTAACATT 3840 

6CAGTTGCTC ATGGACAAAT GACAGAGCGC GATTTAGAAG AAACGATGTT AAGTTTTATC 3900 

AATAATgAAT ATGATATTTT AGTAACGACG ACGATTATTG AAACAGGTGT C6ATGTCCCA 3960 

AATGCAAATA CTTTGATCAT TGAAGATGCA GATCGCTTTG GATTGAGTCA GTTGTATCAA 4020

TTAAGAGGTC GTGTTGGTCX3 TTCAAGTCGT ATTGGTTATG CATACTTCTT ACATCCAGCA 4080 

AATAAGGTAC TAACTGAGAC TGCAGAAGAT CGATTACAAG CGATTAAAGA ATTTACGGAG 4140 
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TTAGGTAAAC AACAGCACGG CTTTATTGAT ACAGTTGGAT TTGATTTGTA CAGTCAAATG 4260 

TTAGAAGAAG CTGTAAATGA AAAACGTGGT ATTAAGGAAC CAGAATCTGA GGTGCCAGAA 4320 

GTCGAAGTTG ATTTAAACTT GGATGCATAT TTGCCAACAG AATATATTGC AAATGAACAA 4380 

GCTAAAATTG AA 4392 
(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 729 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

TTTCTTTTGA ATCTATATCG AGGTGGTTGG TAGGTTCATC TAAAATAAGT ACATTGTCAC 60 
GTTGCAACAT AAGTAGTGCT AGTTGTAAAC GTGCTTTTTC ACCACCAGAT AAATCATTAA 120 
TTATCTTTTT AACATCGTCT TGTACAAATA AGAAACGTCC AAGAACTGCT CGAATATCTT 180 

TTTCATTCAT TAACGGATAT TGATCCCACA CATAATCTAA AATCGTTTTA CTAGATTTAA 240 
ATTCTGCTTG CTTTTGATCA TAATAACCAA TTTGTAAATT TGCGCCGAAA GTAATATCGC 300 

CATTAAGCGC TTTTTGTTGA TTAGCAATAG TTTTAATTAA GGTCGATTTT CCAATACCAT 360
TTGGCCCAAT GATTGCTATA TGATCGCCTT TAGAGACCTC TATACTCATA GGTTTGGTAA 420 
TTGCAGTTTG ATAACCGATT TCTAAATTTT TTACATGCAT GACGTCATTA CCTGTATTCC 480

GGTCAAAGCC AAATTGAATA TTTGCACTTT TGGCATCTAA CATTGGTTTA TCAATGCGTT 540
CCATTTTTTC TAAAATCTTA CGTCTACTTT TTGCCATTCC ACTTGTTGAA GCACGGGTAA 600 
TATTTTTCTC AACAAAAGTT TCTAATCGTT TTATTTCTGC TTGTTGACTT TCATATTCTT 660 

GCATTCGTTT TTGATAATAT AAATCCCGTT GCTGTATAAA TTCCTCGTAA TTACCAACAT 720 
AGCGTTTGA 729 
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13856 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
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TGATGTTTCG ATACATTTGT TGCACCTTGT 
TCCTTACTAT CTTTAGCTTC AGATTCCTGT 
TGTCCTTCAA TATCAACTCG TGGAATAATG 
CCTTTtCCAA ACAATTTCGt TAATGCAGGA 
AAGAGTACAC CAAACGCTAA TGCCATACCC 
ACAAACGCAA AGAAGACACT AAACATAATT 
TCTTTCAATC CTACTTTGAT AGAATAATCA 
ATTCGCGACA TAAGGAAGAC TTCATAATCC 
ATAACCGGTA AAAATGCTAG CATTGGTCCT 
AAACCATCTT GCATTACTAA TGTTGTAAAT 
CCTAAAACTG CTTTTAATGG TATTA6AATT 
GCTAATACAA CAATGACTGA GGCAAATAAA 
TCAATATTAA TGACACTTTG TCCCGAAATC 
TCTTTATGAT AATCTCGTAA ATCATGCACT 
TGCTTAGGTA TCACGACCAT CAAAGCGTAA 
ACGATATCTA CATTTTTCTT ATCTTTT^TA 
AATCCTTGTG GATCATCCTT TTTATCTTTC 
AATCCTTCAC CAAATTTATC CGAGATAATA 
GGTTTAACAC CGTCATCTGG AATACCAAGT 
ACTAATATGA TTAAACCTAG TAATACTGCC 
CATC^CGTAT CAATATCTTT TTTGAATTTA 
TGGAAAATGC TTATTAATGC AGGTAATAAA 
CTAATTGCCG AAGCAAATCC CATTACCGCT 
CATACTGCAA TTACAACTGT TACACCAGCA 
GCAAGACCAA TGCCTTTAAT GTAATCTGTT 
AAAATAAATA ATGCATAATC GATACCAACT 
GTQACATTTG GTATATCGAA TGCATAAGTT 
AGACCAATCA ATGCACTTAT AATTGGTAAT 
AACAGTACAA CAAATGCAAC AATAATACCA 



GGATATACTT TAAAGGTTGT GTCGTATGTT 
GATTCAACCG TTTTATATTT TTCAAGTGCA 
CGATTCAACC ATGCTGGTAA ATACCACGAA 
ATTAACATCA TtCTGACTAC GAAGGCATCA 
ATTGATTTAA TCATGACATC TTCTTGGAAT 
AATGCAGCTG CTACAATAAC AGGACCGCTT 
TTATCCCCTG TTTTACTATm yyCTTCATGr 
ATCGCTAATC CAAATAAGAT ACCTATAGTA 
GTCGTTTCAA TACCAAACAG ACCTTTCATA 
CCTAATGTTG CCATTAATGA CAAGACGAAT 
GAACGGAAGA CAATCATTAA TAAGAAAAAT 
GGTATCGCCT CATTTAACTT TTTAGACATA 
TCCGTTTTGA ACCCATATTT ATCTTGTGCA 
AAATCATTTG TACTCTCTGC ATTAGGCCCT 
TCATTATCTT TACTCATTTG TGGTGGCGTA 
TCTTTATATA CAGACTGTAA ATCTTGTTGT 
ACATTTATCA ACATCGGTAT TTGGCCATTA 
TCGTAAGCTT TTTTCTGTGT AGAATCTGCT 
CGCATATGAC TAACTGGTAT TGCAGCTGCT 
GCAAGTGCAT TTCCTGTAAT AAATTTAGAC 
GACTGTAATT TATTCACTTT AATGCGTTtA 
GTTAAAGCGC TAAGTACTGC AAAAACAACA 
AAGAAGTCAA TGCCTACTAA TGATAAACCA 
AAAACAACTG CACTACCTGC TGTTCCTATT 
TCAGTTTTCA TAACTTGTCG ATATCTGAAT 
GCTAGTCCAA TCATTACGGC TAATGTCAGT 
AACAAACTGA TAATACCTAC ACCAGAGGCT 
CCTGCAGCAA TQACTGAACC GAATGTGATT 
ACTAGTTCAG AATTACCGCC TACTTCTGTA 
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AAATGACTTT TAACATTATC TCTAGAGCCA TCTTTTAAAG ATGTTTGACT AACGTCATAT 1920 

GTGATATCTG CAAATGCAGT TGTTTTATCT TTACTAATTT GCTTATTTTC ATAAGGATCT 1980 

GATATTTTAT CAATGTGCTT GTCATCTTTT TTAATATCAT CTAACGTTTT CTTAATATCT 2040

TTAGTAATGT TCGGTTGCAC AATACCATCA TCTTTAGTCG TCTTAAAGAC AACACGTATT 2100 

TGTGCCTTTT CACTATCTTG ATTAAAATGT TTTTCAATCT TTTTATTCGT ATCTAACGAC 2160 

TCTAATCCTG TCATTTTAAT ATCATTGTCA AATTTCGGTG CATTTGTAGC AAGTGGTATC 2220 

AATATTGCAG CTACAATCAC TATCCATOCA ATGACCGCGG ACCATTTATG TTTTGCGATG 2280 

AATGTCCCCA TCTTATATAA AAATTTTGCC AJVAGTATATT GCCTCCTTTT AAAATCAACG 2340

TTATAGTTTA AATATACAST GTAGATTATT GTTCGATTAT AGTATCTATC CCCGACCTCT 2400 

TAAAGAATCA ATTGGAAAAT TTTGTATATT AAACTACACA CAAAGGAGAA ATGTAGATQA 2460 

AAGAGACTGA TTTACGAGTT ATAAAGACAA AAAAAGCATT GTCGAGTAGC TTGCTACAAT 2520 

TGTTAGAACA GCAATTATTC CAAACGATTA CTGTCAATCA AATTTGCGAC AACGCACTCG 2580 

TACACCGTAC AACATTTTAT AAACATTTTT ATGATAAATA TGATCTTCTA GAGTACTTGT 2640 

TCAATCAATT GACTAAAGAC TACTTTGCTA GAGATATCAG TGACCGTCTT AATCATCCAT 2700 

TCCAAACGAT GAGTGATACX3 ATTAATAATA AAGAGGATTT GAGAGAAATC GCAGAATTCC 2760 

AAGAAGAAGA CGCTGAATTT AATAAAGTAT TAAAAAATGT CTGCATTAAA ATTATGCATA 2820 

ACGATATCAA AAATAATAGA GACCGTATCG ATATTGACAG CGACATCCCA GATAATCTCA 2880 

TATTTTATAT TTATGACTCG TTGATTGAAG GTTTTATACA TTGGATAAAA GATGAAAAAA 2940 

TTGATTGGCC TGGCGAAGAT ATTGATAACA TTTTCCATA6 ATTAATCAAT ATTAAGATTA 3000

AATAGTAGAT GAGAAACTCA TGAGCGTTAC CAACATTCAT AATAAAAACG ATAGTGkACA 3060 

CGTTAATGAA TTCGTGTACT ACTATCGTTT TTTATTTTTA TCGTGCTTAT CGCTATTAAA 3120 

ACAACTGATA CACAACACAT AAACTATGAA GAAAAAAATA AATCCGCTAT CTAAATGACT 3180 

TTGACTCAGT TGTTTAAATG ACCAAATTGC TAATACAATT CCCATTATTA TTGAAATAAC 3240

GTATCTCACA TTCTTATACC TATAATCCTT TTCTAAAAAT ATGGTTGCTA TTACTTAATT 3300 

TTTAAAGTTA TAAATAAAAA GAGCCAACCG CAATGGATGG CCCTTGTTCA TTATGAAGCA 3360 

TTAGAACATT TCTGAAACAA CCTTTTGTTC TAAGAAGTGT AATAAGTAGT CTGGACTACC 3420 

TGTTTTAGCG TCCGTACCTG ACATTTTGAA ACCACCAAAT GGATGGTATC CAACAACTGC 3480

TGAAGTACAG CCTCTGTTAA GGTATAAATT GCCTACATCA AATTCGTTTA CCGCTTTAAT 3540 

CCAATGCTCG CGATTATTTG TAATCACTGC ACCAGTTAAA CCGTAATCTG TATCATTTGC 3600 
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10 






ITCTTCri'GC ATGATTCTAT CTTTAGATTT AAGTCCTGAA ATGATTGTTG GTTCTACAAA 3720

GTAACCTTTT GAATCATCAG TGCCGCCACC TTGTTCTAAT TTACCTTCTT CTTTACCAAT 3780

CTCAATATAA TTTTTAATCT TATCAAATTG TTTTTTATTA ATAACTGGGC CCATATACGT 3840

ATTGTCTACA GTATTGCCCA ACGTTAATTC TTTTGTTAAT TTGATTGATT TCTCTAATAC 3900

TTCGTCATAA ACGTCTTTAT GCACAATTGC ACGTGAACAT GCTCAACATT TTTGACCAGA 3960

AAAACCAAAT GCTGACGTTA CAATAGCTTC TGCTGCCATA TCT6TATCAA TATTTTCATC 4020

AACTACAATG GCATCTTTAC CACCCATTTC AGCGATAACA CGTTTCAAGA AGTTTTGACC 4080

TTCTTGAACA ACGGCACTAC GTTCATAAAT TCTAGTACCT gtcxk:acgtg ATCCTGTAAA 4140

TGTAACGAAA TGCGTATCTT TATGATCAAC TAAGTAATCA CCAATTTCTT TCGGATCACC 4200

AGGAACAAAG TTAACTACGC CTTTTGGTAA TCCTGCTTCT TCTAAAATTT CCATTAATTT 4260

ATAAGCGATA TAAGGTGTAT CCTCAGCAGG TTTCAATAAC actgtattac CTGCCACAAC 4320

TGGTGCTAAA GTTGTACCAG CCATAATCGC AAACGGGAAG ttccacggcg GAATTGTAAC 4380

ACCTGTACCA ATTGATTTAT AGAAATATTT ATTGTGTTCA CCTTCACGAT CAAGTACTGG 4440

CTTACCTTGA GCCAAGTCCA TCATTGAACG TGCATAGTAT TCAATAAAAT CAATAGCTTC 4500

AGCTGCATCA CCAACTGCTT CATCCCATGG CTTACCTGCT TCATAAACCA TAATTGCTGC 4560

AATTTCCGCT TTTCX5ACGAC GAATAATTGC CGAAACACGT AACATAAGCT CTGCACGATC 4620

ATTTGCTGAC CATGTTTTCC AAGATTTATA AGCTTCGTTT GCTGCTTTAA ACGCATCTTC 4680

AACATCTTGT TTTGTTGCCT TTGATGCATT TGCAATCACT TGTGATGTGT CTGCAGGATT 4740

GATTGATTTA ATTTTGTCAT CTTTGAAAAT CTTCTCTCCA TTAATCACTA ATGGTATGTC 4800

TTGACCTAAT TCTTTTTCCA CGTCTTTCAA TGCTTTCTTA AACATATCCA CATTTTCTTG 4860

GACTGAAAAA TCGTAACCAG GTTCATTTTT AAATTCTACT ACCATGTACA CTTACCCCCT 4920

ATAAATTTTG AAAGTGGTTT AACCCTTTGA TTTAATGATA TAACATCATT TAAACTCATT 4980

TTACTATGAT TAAGGTTAGT TTTGCAATCG CTTTCATTTT TATGTTTTAT CACTTATTCT 5040

CAAGTATTTT GAAATTGATT GGTTACTTTT TAAAATTTAT ATGGGTCGCA ACTGCTACTT 5100

TATCGTTTCG TCATTTAATG TTTCGGATGG TAGGTCATTA TCAATTTTAC QAACGACTTT 5160

ACAAGGGTTT CCAACCGCTA AGCTGTGTGG CGGAATATCT TTAGTGACAA CACTACCAGC 5220

ACCAATCACA CTGCCTTCTC CAATCGTCAC CCCTGGTAAC ACGGCTACAT GACCGCCAAA 5280

CCAAGTATTA CTGCCAATAT GAATGGGTCC GGCTTTTTCA AAACCTTCAT TTCTATGATG 5340

GAAATTAAGT GGATGTGTCG CTGTGTAGAA TCCACAATTA GGTCCTATAA AAACATTATC 5400

TCCTAGTTTA ACGTTCCAAC CATAATCTGT ATCAAAAGGA ATCGAAATAC TTACATTGTC 5520

TGTTGTTGTT TGAAATAATT GATCAATTAA TTCCTTTCTT TTATTTGTAG CACTCGGTCT 5580

TGTATGATTT AATTCAAAGC AAATATCTTT CGCTCGTGCA CGTTCATTGA TTAAGTATTG 5640

ATCAAAGTTT GCATCGTACC ATTTTTCTGC TAACATTTTT TCTTTTTCAG TCATTACACC 5700

TTTCAACTCC TAATAACTTA TTTACTTGTT TAAAAGTTAA TCAAATAAAC CTTCGCCTAT 5760

GCAACTAATA CGCTATAACA TTATGAAATC ATGACCTTAT CACCCTTATC TATACAATTC 5820

TCGCATCAAA TACTGCTAAA GTAGTAGATA AATTCAATAC TACAGACGCA TTCATTTTTT 5880

AATCTATTAA CGTACAAT6T GAGTAAGAGA AATATAAAGG AGTATGATAG CGATGAGAAT 5940

ATTAATTACA GGCACAGTTG CTATCTTAAT CATTCTAGGT TTGGTCAAAA CGATACAAGA 6000

TTACGAAATG ACAAACGACA CGAGTCGTcA GTTGTCAGAC AACAAAGATG ATGATAAAGT 6060

CATCCATCTT AATAATTTTA AAAATTTACA TGCGAAAGAA TTTAACCCAT CTGATTTCTT 6120

TTAAGTCACC TAAGAATTGC AAATCCAGAA GTCATTTAAG TTTTACCTTT CATTCATACA 6180

TCCTTTAATA TTAATTACGA CTTCTTTTAT ATAGATGCTA AGTAGAGA6A TTGTTGTGCA 6240

ATGTTTGCAC GGCAATCTCT CTTTTTCTTT TTAAAATTGG TAAAAGTAAA ACGCAACGAT 6300

TGACTTATAT ACCTATAGGG GGTACATTAG ACGTGTAACA ATGAATCACA GGGAGGCAAT 6360

AATGTGGCTA ATACGAAAAA AACAACATTA GATATCACTG GTATGACTTG TGCCGCATGT 6420

TCAAATCGTA TCGAAAAG/^ ACTGAATAAA CTTGATGACG TTAATGCCCA AGTGAATTTA 6480

ACTACAGAGA AAGCAACTGT TGAGTATAAC CCTGATCAAC ATGATGTCCA AGAATTTATT 6540

AATACGATTC AACATTTAGG TTACGGTGTC GCTGTAGAAA CTGTCGAATT AGACATTACA 6600

GGTATGACTT GTGCTGCATG CTCAAGCCGT ATTGAAAAAG TGTTAAATAA AATGGACGGC 6660

GITCAAAATG CAACGGTCAA TTTAACAACA GAGCAAGCTA AAGTTGACTA TTATCCTGAA 6720

GAAACAGATG CTGATAAACT TGTCACTCGC ATTCAAAAAT TAGGTTATGA CGCGTCTATT 6780

AAAGATAACA ATAAAGATCA AACGTCACGC AAAGCTGAAG CGCTACAACA TAAATTGATT 6840

AAGCTTATCA TATCAGCAGT ATTATCTTTA CCACTATTAA TGTTAATGTT TGTACATCTT 6900

TTCAATATGC ATATACCAGC ACTATTTACG AATCCATGGT TCCAATTTAT TTTAGCTACA 6960

CCTGTACAAT TTATTATTGG ATGGCAATTT TATGTAGGTG CTTATAAAAA CTTAAGAAAT 7020

GGTGGCGCCA ATATGGATGT ACTTGTTGCT GTTGGTACAA GTGCAGCATA TTTTTACAGT 7080

ATTTATGAAA TGGTTCGTTG GCTAAATGGC TCAACAACGC AACCGCATTT ATACTTTGAA 7140

ACAAGCGCCG TACTAATTAC CTTAATCTTA TTCGGTAAGT ATTTAGAAGC TAGAGCX^G 7200

TTAAAAGATG GTAATGAAGT GATGATTCCT CTAAATGAAG TACATGTTGG AGATACACTT 7320

ATCGTTAAAC CAGGTGAAAA GATACCTGTT GATGGCAAAA TTATTAAAGG TATGACTGCC 7380

ATCGACGAAT CTATGTTAAC AGGTGAATCT ATCCCTGTTG AGAAGAATGT TGATGATACT 7440

GTAATTGGTT CAACGATGAA CAAAAACGGT ACTATTACTA TGACAGCAAC AAAAGTTGGC 7500

GGGGACACTG CGTTGGCAAA TATTATTAAA GTTGTCGAAG AAGCTCAAAG TTCTAAAGCG 7560

CCGATTCAAC GATTGGCAGA TATTATTTCT GGTTATTTCG TTCCTATCGT TGTTGGTATC 7620

GCACTATTAA CATTTATCGT GTGGATTACT TTAGTTACAC CAGGTACATT TGAACCTGCA 7680

CTTGTTGCGA 6TATTTCCGT TCTCGTCATT GCTTGTCCAT GCGCATTGGG ACTTGCTACA 7740

CCAACTTCTA TTATGGTAGG TACTGGTCGC GCTGCTGaAA ATGGTATTTT ATTTAAAGGT 7800

GGCGAGTTTG TTGAACGCAC ACATCAAATT GATACCATCG TTTTAGATAA GACGGGTACC 7860

ATTACAAATG GTCGTCCAGT CGTGACAGAT TATCATGGTG ACAATCAAAC GCTACAACTA 7920

CTTGCTACTG CTGAAAAAGA TTCTGAACAC CCATTGGCAG AAGCCATTGT CAATTATGCA 7980

AAAGAAAAGC AATTAATATT AACTGAGACA ACAACATTTA AAGCAGTACC TGGCCATGGT 8040

ATTGAAGCAA CGATTGATCA TCACCATATA TTGGTTGGTA ACCGTAAATT AATGGCTGAC 8100

AATGATATTA GCTTGCCTAA GCATATTTCT GATGATTTAA CACATTATGA ACGAGATGGT 8160

AAAACTGCTA TGCTCATTGC TGTTAATTAT TCATTAACTG GTATCATCGC AGTGGCAGAT 8220

ACTGTCAAAG ATCATGCCAA AGATGCTATA AAACAATTGC ATGATATGGG CATTGAAGTT 8280

GCCATGTTAA CTGGCGATAA TAAAAACACT GCTCAAGCCA TTGCAAAACA AGTAGGCATA 8340

GATACTGTTA TTGCAGATAT TTTACCAGAA GAAAAAGCTG CACAAATTGC QAAACTACAG 8400

CAACAAGGTA AGAAGGTTGC GATGGTTGGT GACGGTGTAA ATGATGCACC TGCATTAGIT 8460

AAAGCTGATA TCGGTATCGC CATTGGTACA GGTACAGAAG TTGCCATTGA AGCAGCTGAT 8520

ATTACTATTC TTGGTGGCGA CTTGATGCTT ATTCCTAAAG CCATTTATGC AAGTAAAGCA 8580

ACCATTCGTA ATATTCGTCA AAATCTATTT TGGGCATTCG GCTATAATAT TGCCGGTATC 8640

CCTATAGCTG CATTGGGCTT ACTTGCGCCA TGGGTTGCTG GTGCTGCAAT GGCACTAAGT 8700

TCAGTAAGTG TTGTCACAAA OGCACTTAGA TTGAAAAAGA TGCGATTAGA ACCACGCCGT 8760

AAAGATGCCT AGATTCCTTA ATAATGAAGG ATTCGTTGGT GATTCTGAGA TAGGCTAGTG 8820

ATTGGCTCTA TAATGTCGCG GTTTAyaGTt GGATCTTCGC TCCAACTGCA TATATAGTnA 8880

CACTTTTCGC TTGGCGAATT AGTGTATCTT ACCTAATAGc TCCGCCTATT AGGTTCCATC 8940

ATTATTATAA ATAATAAGTA CACTACGGtT TACAGTTGGA TCTTCGCTCC AACTGCATAA 9000

GAAATTTTAA ATGTTGAAGG TATGAGCTGT GGTCACTGCA AAAGTGCTGT TGAATCTGCA 9120

TTAAATAATA TTGACGGTGT CACTTCAGCT GACGTTAACC TTGAAAATGG TCAAGTAAGT 9180 

GTTCAATATG ATGACAGTAA AGTTGCTGTA TCTCAAATGA AAGACX5CAAT TGAAGATCAA 9240 

GGTTACGATG TCGTTTAATT AGGCAATATT CAACGTCATC AACACCAAAT TAAAAAATCG 9300 

AACTGATGAG AATCCCAACA ATCCAAATTA TCTCATCAGT TCGATTTTTA ATTTACTCGT 9360 

AACCTAGTAT CTCCAGTCTG CAATACATCT AATGTTGCAT CTAATGCATC GACAATTAGA 9420 

TTTTTAACTG CAGCTTCAGT ATAAAACGCA ATATGTGGTG TTAATATGAC ATCTTCCCTG 9480 

TCAATCAACG ATTCTAACAA TGGATCGTTC AGTGTTTTGC CCCTTTGATC ACTTGGGAAA 9540

AGTTTGCGTT CAAATTCATA CGTATCAAGT GCTGCACCTT TAATCACACC ATTGTCTAAT 9600 

GCGTCTAATA ACGCCTTAGT ATCTACTAAA GAACCTCTCG CACAATTGAC AAATACTGCG 9660 

CCCTTTTTAA AATGTTTAAA TAATTCAGCA TTAAATAGAT AATGATTATA TTTCGTTGCA 9720 

GGTACATGTA ATGTCACGAT ATCAGCACCT TCAACCGCTT CCTCAATCGT ATCTTTGTAA 9780 

TCGACATACG TTGCAATTTT AGCATTAGGA AACGGtCGTA TGCGACCACA TCACTTTGAT 9840 

AACCATTGGC AAATATATCG GCTACTACAC GGCCAATTCG ACCTGTACCA ATAACAGCTA 9900 

CTTTTAAATC TTTAATGGAT TTCGATAAAA TAGTAGGTTC CCATCTAAAA TCATGcTCCC 9960 

GCACTTTCGT TTGAATTTGA TTAAAATGAC GAACCACATT AATAGCCTGG TTCACAGCAA 10020

ACTCCGCAAT TGAATTCGGA GAGTATGACG GCACATTTGA CACAATAAAG TTATACTTGT 10080 

TTGCTAACTC CAAATCATAT GTATCAAATC CAGCACTACG TTGTGCGATT TGTTTAATAC 10140

CTAGTTCATT TAATCGTTTA TAAACATGCT CTGATAATGG TATTTGTTGT GATAGCGATA 10200

AGCCATCATA ACCAGCGACA CCTTCAACAT TCTCATCAGT TAATGCTTCT TTAGTAATAT 10260 

CTACerCAAC ATGATGTTTC TCTGCCCACG CCTTGATATA AGGCATATCT TCATCACGTA 10320 

CACTCATGAT TTTAATTTTT GTCATTTTAA CATCACCCTT AACTTTATTA TTCATATAAA 10380 

TATGCTAGTT CTGTTAATCT TATTGCAGCT TCGTCTAATT TCTGGTCATC TAACGCCAAT 10440 

GAAATTCTCA CATAACGATT ACCATTCTCT CCAAATGGTT TCCCTGGAGC AACAAGTATT 10500 

OACTTCTCTT GCACTAAAAA TTGCTCAAAT TGCTCGCTGT CATAACCAGG CGGTGTTTCC 10560 

AACCATACAT ATATGCCACC TTTAGCATGA ACAAATGGCA AATCAGCTTT TGCAAGCATG 10620 

GCTTCGAATC GGTCACGACG TGTTTTAAAT ACATTGCTTT GTTCTTCTAA AAAATCATCA 10680

TAATGATTCA AAGCATATAT TGCGGCATCT TGTAATGCAC CAAACATCCC AGCATTTGTG 10740 

TGCGTTTGGT ACTITTTCAA AGCTTGAATC ATATCTTTAT TACCAACTGC AAAACCGACT 10800 
CCATTTTCCG AAGCAAGTAT ACTAGGATTT TTAGCGTCGA AACCGAAAGC ACCATAAGCA 10920

AAATCATGCA CGATTTTAGT GTCTGTACCT TTAAATTTAG cTATCGCTTC ATCAAAAACT 10980

TCTTTCGTAG CTGTCGATCC AGTTGGATTA TTTGGATACG TTAAATAAAT GAGTTTTGTT 11040

TTATCTATTA TTTGTGAATC AACTTTGGAC CAATCTGGCA AATAATGTGG CGGTTCTAAA 11100

TTAAGCGGGA CTGGCTTGCC ATCAGCTAAA AGTACACCTG CTAAATAATC CGTGTAGCCT 11160

GGATCAGGTA GTAATACATA GTCTCCTGGA TTGATAACAC ATGTTGGTAC TGCCACTAAT 11220

CCATTTTTTG TACCATATAA AATGCATACT TCATCTTCTT TATCTAACGT CACATTATAT 11280

TGTCTTTGAT AAAAATCTAC AATAGCTTGC TTGAACX;CTT CTTTACCATG AAAAGCACCA 11340

TATTTTTGAT TTTCAGGAAT AGTTAGTGCT TTTTGAAAAT GATCAATAAT ACCTTGTGGC 11400

GTGGGCCCAT CAGGGATTCC AACTGCCATA TTAATTAATG GCAATGGTCC ATGTTCX3ATT 11460

TTACGTCCCA TCGTTTTCCC GAAATAACTA TCAGGGATAT TTGCTAATTT GTTAGAGATC 11520

ATCAAATTCC TCCTCTATCA TTAAACATAG CCTGGGCGAC TATCATAATC CTAACAACTT 11580

GTATCACTCT CATTTAGATG GTTACAATGA CATCGCCATT CACCGTTATG TTCAACAGAA 11640

CTTATGACAC ACGTTGTATT GAATGAATTT ATTTTCATTT TAGGTAGGTA TAATATTATT 11700

GTCAATATTA GGAATTTTCA GATTAATATG CACTCAATCG TTATGATTTA ACTGTCATGC 11760

ATATCCGCAT GCGCAACCAG TTAGATATGC TTATATAAAG TATAACGCCC ATCAAGGTAC 11820

GTATTCAAAC GTGAACCTTA ACAGGCGTCA TTCATTGTTA AATAAAACTT CTTAAGCACA 11880

TACTTATTTC ACTATGCCTT TTACGTTCCC CTTATACTTT TCTCACATCT TTCTCTTAGA 11940

CTACTCCCTT ATACGCCCCG CTCAATATCT TTAATCATTT CATCTACAGT TATTTTCXKy^ 12000

CTCGTTAAGA CAATAGGAAC GCCTGCACCT GGATGCGTAC TTGCACCTGC AAAATATAAA 12060

TCTTTATAAT CTCGCX5ATAC ATTTTGTGGA CGATAATAAT TACTTTGCGC TAAAGTTGGC 12120

ATTAAACCGA ATGCCGAACC AAATTTCGCA TGATACGTTT GCTCAAAATC ATTTGGCGTA 12180

AAGATTGTTT CTGAAACAAT ATGCGATTTT ATATCTTCAA ATACTTCAAT CGTTGCTAAT 12240

TTACGATAAA TAATTTCCTT TATTTGTTGC GTCAAAGCTT CATCTGACCA ATCGATTCCG 12300

CTACCTGTTT TAAGTTCCGG CGTCGGCATT AGCACATAAA TACCAGTTTT GCCTTCTGGC 12360

GCAAGTGATT TATCAGCGAC CGCTGGTACA TACACATAAA TAGAAGGATC ATATGATAAA 12420

CGTCCCTCAA ATATTTCTTC AATATTGCCT CTAAAGTCAT CTGAAAAAAT AACATTATGA 12480

AGTCTCACTT GATCTGTCAC ATCAATATCT ATACCGATAT ACATTAAAAA TGCTGAACAA 12540

GAGTAATCTA AGTCTGCAAT TTTATGTGGT GGATACTTTT TAATAGGTGC AAAATCTGGC 12600

ATGTCACCAT TCACTTTTAT CGCATCGGCC CGTTTGAATT TAGGATCAAT AATAATTTGC 12720 

TCAATTTCAG CATTTAGTTC AATATTAACG CCTAAGTCTT TATTTAATTG CGCTAGCCCT 12780 

TGAGCCATGC CATACATACC GCCTTTAATA AAATGCACAC CAAACATCAT TTCAATCATA 12840 

GGAATAATTG AATATAGTGA CGGGCCTCGT TTTGGATCAA TTCCTATGTA TAACGTTTGA 12900 

AACGCTAAAA GCTTTTGTAT CTTTTC6TTA TCAATATAAT GTTCAATTAG CTGATCTGCA 12960 

TGATTTAACG TTTTTAACTT AGCACCTTGC ACAAGTGACG TCATATTATA AAAGTCACTC 13020 

GGTTTGCGAT ACGTTCTTTC TAAGAAATAG CGACGTGCAA TTTCATATTT TTTATAAACA 13080 

TCCGTTAAAA AGGACATAAA ACCATGCGTT GAACCAGGTT CTATACTTTC TAGCATTTGC 13140 

T6TAATTCAG CTAAATCTGT AGGCACCGTT ATACGATCAT CGTGGTCAAA ATACACATCG 13200 

TAAATATAAC GTAATTGTCT CAATTCAATA TAATCTTCAT AATTTTTACC ACACGCTGTA 13260 

AAAACATCTT TATAAACATC TGGCATCATG ACAATTGTGG GACCCATATC AAATGTAAAG 13320 

CCGTCTTTCT TTAATTGATT CATACGCCCG CCTACATTAT TATTTTTTTC AAATATCGTC 13380 

ACTTCATGAC CTTGAGAAGC AATACGGGCT GCCXXTTGCTA ATCCTGTGAC ACCTGCACCA 13440 

ATTACTGCAA TCTTCATTAT TCAACCACCT ATATTCTATG ATATTTACTA TTTATTTCAT 13500 

GAAACAACTT TGCCTTTTTC CTCTTATCCA CAAAAACACG TTCATGTAAT GTATAGTTAG 13560 

CCTGTCTCAC TTCGTCCAGT ATTTCAATAT ATATACGTGC TGCTAATTCT ATGATTGGTT 13620 

GTGCTTCAAT ACTAAATACT TTGATTTGAT CCATAACATC TTGAAAATCT TTTTCTGCGA 13680 

TAGCTGCATA ATATTCCCAT AAGTCAATAT AATGATTATT AACACCATTT TGGTACACTT 13740 

CAGCAATATC AACTTCATAT TGCTTTAATC GTTGCTTACT TU^TATATC CGTTCATTGT 13800 

aUKAATCTTC ACCGACATCT CTTAATATAT TAAnGGGATC CTCTAGAGTC GACCTG 13856 
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10088 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

ATATATAAAT ATAGATTAAG TATATAGATT AATCAACTTT TTTGGAAGAG CAAATCACGC 60 

AATCAACAAA TAATATAAGA AGTTTTTGCG ATAGTTTTAA AATAGCTGTA ATAGAATACT 120 

AAATGTGACA AACTTAGAAC TAATATCAAG T6TTGATGTT TTGAATATAA AAATGCTAAT 180 
ATAATTGGTT AATATATGAG TAATTAGAAA ATAGACAAAG GATGACGATT TATGTATATC 300

AATATGAAAG ATTATGGGTT AACAGGCATA AACAAAACTA AAGATACTCG AGCAATACAA 360

CGTGCGTTAA ATCGTGGAAG ATGTAAACCA ACGACAGTTT ATATACCGAA AGGGACGTAT 420

GATATTTGCA AACCATTAAC GATATATGGC AATACAACAC TTTTGTTAGA TAATGAAACT 480

ATTTTACGCC GATGTCATTC TGGTCCTTTA TTAAAAAATG GTCGTCGCTT TGGTTTTTaT 540

CGTGGTTATA ATGGACACAO TCATATTCAT ATTAAAGGCG GCAAGTTTGA TATGAATGGT 600

GTATCGTATC CTTATAACAA TACAGCTATG TGCATTGGGC ATGCTQAAGA TATTCAATTA 660

ATAGGTGTGA CCATTAAGAA TGTAGTGAGT GGTCATGCAA TTGATGCTTG TGGGATTAAC 720

GGACTCTATA TTAAAAGCTG TTCATTTGAA GGATTCATAG ACTATAGT6G CGAACcTTTT 780

ATTCTGAAGC AATACAATTA GACATTCAAG TACCTGGTGC TTTTCCAAAA TTCGGAACgA 840

CAGATGGTAC GATAACGAAA AATGTCATTA TCGAAGATTG TTATTTTGGA CCTTCAGAAT 900

TGCCCGAAAT GGGAAGTTGG AATCGTGCTA TTGGCTCACA TGCAAGTAGA CATAATCGAT 960

ACTATGAGAA TATTCATATT AGAAATAATA TATTTGAAGA TATACAAGGT TATGCATTAA 1020

CTCCCTTGaA GTATAAAGAT GCTTTCATTA TTAATAATAA GTTTATTAAC TGTGaGGGTG 1080

GCATTAGATA TTTAGGAGTT AGAGATGGTA AAAATGCAGC AGATGTGaTG ACAGGaAAAG 1140

ACTTAGGTTC CCAAGCAGGC ATAAATATGA ATATAATTGG AAATGAATTT AAAGGATCAA 1200

T6TCTAAAGA TGCGATACAT GTACGTAATT ATAATAATGT TAAACATAAA GATGTATTAA 1260

TCGTTGGGAA TACATTCAAT AATTCGACTC AATCAATTCA TTTAGAAGAT ATTGATACAG 1320

TGTTTTTAAG TCCTGTTGAA GCGGGTATTC AAGTTACTAC AATCAATGTA GATGAAATAA 1380

AAAAGTAAAA AGTTTCGCAT GACATTAGGA TTAAGAATAG TAGATAATTT TTGAAAGOGC 1440

ATTGATAAAA CX5GTATAAAT ATGCTATAAT AAACCCAATT ATCTGATAAA AGGGGTATTT 1500

TGACGGTAAT GATAATACAA GATAGACAAC TTTCTATACT CTAATATAGT GAGTTGAAGT 1560

AGCTTGTCAT AATCATCATG AGGGGGAAAT TTATGGCTTA TTTCAATCAA CATCAATCAA 1620

TGATATCGAA AAGGTATTTA ACATTCTTTT CAAAATCAAA GAAAAAGAAA CCGTTTAGTG 1680

CGGGACAACT TATTGGACTA ATATTAGGTC CATTACTTTT CCTATTAACA TTATTATTCT 1740

TTCATCCACA AGACTTACCT TGGAAAGGCG TCTATGTTTT AGCGATTACT TTATGGATTG 1800

CGACTTGGTG GATTACTGAA GCAATTCCTA TTGCAGCAAC GAGCTTATTA CCAATTGTGT 1860

TATTACCATT AGGTCATATA CTTACACCAG AACAAGTATC ATCCGAATAT GGCAATGATA 1920

TTATCTTTTT GTTTTTAGGT GGATTTATTT TGGCAATTGC AATGGAAAGA TGGAATTTAC 1980



EP 0 786 519 A2 

TTGGATTCAT GGTGGCAACA GGATTCTTAT CTATGTTTGT ATCGAACACT GCAGCTGTAA 2100 

TGATTATGAT TCCGATTGGT TTAGCAATTA TTAAGGAAGC ACATGATTTA CAAGAAGCCA 2160 

ATACGAATCA AACAAGTATT CAAAAGTTTG AAAAATCTCT AGTTTTAGCA ATTGGCTATG 2220 

CAGGTACGAT TGGTGGCTTG GGTACATTAA TCGGAACCCC GCCATTAATT ATTTTAAAAG 2280

GACAATACAT GCAACATTTT GGACATGAAA TTAGTTTTGC TAAATGGATG ATTGTAGGGA 2340 

TTCCAACGGT CATTGTTTTG TTAGGTATTA CrTOGCTCTA TTTAAGATAT GTTGCGTTTA 2400 

GACATGATTT GAAATATTTa CCTGGTGGTC AGACOTTAAT TAAACAAAAG TTAGACGAGC 2460 

TTGGCAAAAT 6AAGTATGAA GAAAAGGTAG TACAAACTAT CTTTGTACTT GCTAGCTTAT 2520 

TATGGATTAC AAGAGAGTTT CTTCTGAAAA AATGGGAAGT TACGTCATCT GTTGCAGATG 2580 

GTACGATTGC TATTTTTATA TCAATATTAT TATITATTAT TCCAGCTAAA AATACTGAAA 2640 

AACATCGCCG TATCATTGAC TGGGAAGTTG CAAAAGAGCT CCCTTGGGGT GTATTAATTT 2700

TATTTGGTGG CGGTTTAGCA TTAGCGAAAG GTATTTCTGA AA6TGGTTTA GCAAAATGGT 2760 

TAGGOGAACA GTTGAAATCA TTAAATGGTG TTAGTCCGAT TCTTATTGTA ATTGTCATAA 2820 

CAATCTTTGT CTTATTTTTA ACTGAAGTGA CATCTAATAC TGCAACTGCA ACGATGATTT 2880 

TACCGATTTT AGCAACGTTG TCTGTTGCTG TTGGAGTGCA TCCATTACTA CTTATGGCAC 2940 

CTGCAGCTAT GGCGGCTAAC TGTGCATACA TGTTACCAGT AGGGACACCA CCGAATGCAA 3000 

TTATCTTTGG TTCTGGTAAA ATATCTATCA AACAAATGGC ATCAGTAGGA TTCTGGGTAA 3060 

ACTTAATCAG TGCAATAATT ATTATTTTAG TCGTGTATTA TGTAATGCCT ATAGTTTTAG 3120 

GTATTGATAT AAATCAACCA CTGCCATTGA AATAGTAATT GCAGATTAGA ACGAAAAATA 3180

AAAGGTTACA TTAGCAATTG CTTGGACGAG TGGTAACGAA ACGTATACCG CAGCATCGTG 3240 

TAASAACAAT ACAAACAAAA GAAAGTCAAC CAAGGATGGA TTCCTATTTT AATCCTTGGT 3300 

TGAOTCTTTA TTTTATTTAA ATTGTAGAAC CTAGAAAATA AAGTTTAATT AAAAGCACCA 3360

ATCATTTCTA CTTTGAAATC TAAGGTTTCT AAAATAGCAA TGACTTTCTT TATATCGGTT 3420 

GTAATTGCAG AATCAGCCTG AACGAAAAAT CGATACATAC CTAATTGTGT TTTTAAAGGA 3480 

CGAGACTCAA TCCAGGATAA AITAATATTA AACAAAGCAA ATGTATTAAG CACACTTGCT 3540 

AACAACCCAG GTTTATCATG CATTGGTGTA ATTAAAAACA TCAATGATGT CGCATTTTGA 3600 

TCAAATTGCT GCTGATTTTT TATAACTAAA AAACGTGTCA CGTTATGTGG ATAGTCTTCA 3660 

ATATGTGTAT CAATAGGTGT AAAACCATAA GctTCGCCAC TACCTAAAGG TGCAATT6CT 3720 

GCAACGCCAT TTTCAATTTT AGTCAAACTT TGAATTGTAC TGTCGACATA ATCATAGTCA 3780 

TTTTTAATAT CAGAAATGGA ATCTGTTCCA TTACCATATA ATGCAAAGTT AATATCTAAA 3900

CGTATTTCAC CGTGTGCAAA GACATCTTGC TGTGCAAGTG CATCTGCCAC AATGTTGATT 3960

GTTCCTTCTA TAGAATTTTC AATAGGGACA ACACCAATCG ATGTGTCATC ATCTGCAACT 4020

GCCTTGATGA CTTCAAATAA ATTTGACTTT GGTTGAAAAG TTGCTTCATT TTCAGAAAAA 4080

TACT6ACGAC AAGCCAAATA TGAAAATGTA CCTTTAGGGC CTAAATAATA TAATTGCATA 4140

TGCTACACCT CTACTAACTT AATGATGGAA AGGGCACTGG TTAGCATTTG ATTCTTTCTT 4200

TTTATAGAAA AAGTTTGGAT CTTTTACTGT ATTGTCATAT CCGTGATGAT AATTTGACGT 4260

CAATGTTGGA GATAATGGCG GTGCTAGCCA AGACCATTTT CCGGTAACTT GACGACCTTG 4320

TTGTGCTTCG TTACXnTCGA ATAGTTCGAA TTOCTTTGCA OCGGTCAAAT GATOGACAAT 4380

TGATACGCCT TCTTTTTTAA AGGAATGATA CACAGCATAG TTCAATTCAA CAAGTGCTCX5 4440

ATCTTTATTA AATGAATTAT TTTTAAGTGT ATCAAATTCA AACGCATCTG CAACTTTTTC 4500

TAGTAAATTG TAACGGTAAT CATCAATAAA GTTACGTACG CCAATTTCAG TTACCATATA 4560

CCAACCGTTA AAGGGTGCAG TTGGATATAC AATGCCACCG ATTTTTAAGT CCATATTGGA 4620

AATGATAGGG ACTGCATACC ATTTTAAGTT CAATTLTTCIT AATTTTGGAT AATGATTATG 4680

TTCAATAOGT ACTTCTTTAA TTAATGAAGT AGGATATTCG TAAAATTTAA CTGACTCATT 4740

AGGTAATTGG TAAATCAGTG GTAACACGTC AAAATTAGTA CCTTTTCCTT TCCAACCTAA 4800

GTGATTTGCT AAGCGTGTAA CTTCTTTTTC AGCAGGATCA CCACAATTGT CATAGCCAGC 4860

ATAGCGAATT AATTGATTGT TGAAAATTTT AGGTCCATCC TTTGGAGCAT ATATAGTAAT 4920

ATACGGCTTT AATTTACCTT CATTTGTAGC CTGTGTAATA TGATAAGTAA TTGATGATAA 4980

GAACGATGCT TCGTCAGTAA CATCTCTTGC ATCAATQACA TTTAACGAAT CCCAAAATAA 5040

ACGACCAATG CAACGATTTG AATTACGCCA AGCCATTTTA GCACCATAAA TAAGTTCTTC 5100

TTCTGTATGT GTATATGTCC CAGTTTCTTT TATTTCTAGT TCAATGTCAT GTAAACGTTT 5160

ATTGATAATT TGCGTTTCAT AATGACACTC TTTATACATG TTTTCTATGA AAGCTTGAGC 5220

CTCTTTAAAT AACATTAACA ACACCTCX5CT TTATATTATA GTCTACATTA TTAAAATACT 5280

CTTAAAAATT ATGTATATGT CATTAAATTG TTGGTTGATT TTAATTAAAA GTATGGAAAT 5340

TAAGGGGCTC TTATGTATAT AAAAAAATGA ATTATGATAA AATGTAAGAA AATATTTAGG 5400

TCGATTGGAG AGATACAAGT GTACCAATTA GAAGACGACA GTTTAATGTT ACATAATGAC 5460

TTATATCAAA TAAATATGGC TGAAAGTTAT TGGAATGATA ATATTCATGA AAAAATGGCT 5520

GTATTTGATT TGTATTTTAG AAAAATGCCA TTTAATAGTG GCTATGCTGT TTTTAATGGT 5580

TTAAAGTCTA TTGGCTACAA GGATGATTTC TTATCATATT TAAAAGATTT AAAATTCACA 5700

GGCAGCATCC GTTCGATGCA AGAAGGCGAA TTATGCTTTG GTAACGAACC ATTGTTACGC 5760 

GTAGAAGCAC CATTGATTCA AGCGCAATTA ATAGAAACAA TTTTATTAAA CATTGTAAAT 5820 

TTCCATACAT TAATTACAAC AAAGGCTAGC AGAATTCGTC AAATTGCATC AAATGATAAA 5880 

TTAATGGAGT TTGGTACAC6 TCGTGCGCAA GAAATTGATG CAGCATTGTG GGGCGCTAGA 5940 

GCTGCTTACA TCGGGGGCTT TGATTCTACA AGTAATGTTA GQGCGGGGAA ATTATTTGGT 6000 

ATACCTGTGT CTGGTACACA TGCACATGCA TTTGTCCAAA CTTATGGAGA CGAATATGTT 6060 

GCCTTCAAAA AATATGCTGA AAGACATAAA AATTGTGTGT TCCTAGTAGA TACATTCCAT 6120

ACTTTAAAAT CTGGCGTGCC AAATGCAATA AAAGTTGCAA AAGAATTAGG TGACAAAATT 6180 

AACTTTGTAG GTATTCGATT AGATTCTGGA GATATCGCTT ATTTATCTAA AGAGGCAAGA 6240 

CGTATGCTTG ATGAAGCAGG ATTTACTGAA ACTAAAATTA TCGCGTCTAA TGATTTGGAT 6300 

GAAGAAACGA TTACGAGTTT GAAAGCACAA GGTGCAAAAG TAGATTCTTG GGGCGTTCGT 6360 

ACAAAGCTGA TTACAGGATA CGATCAACCA GCATTAGGTG CAGTATATAA ACTTGTAGCT 6420 

ATTGAAAATG AAGATGGTTC ATATAGTQAT CGTATTAAAT TATCAAATAA CGCTGAAAAG 6480 

GTTACX3ACGC CAGGTAAGAA AAATGTATAT CGCATTATAA ACAAGAAAAC AGGTAAGGCA 6540 

GAAGGOGATT ATATTACTTT GGAAAATGAA AATCCATACX* ATGAACAACC TTTAAAATTA 6600

TTCCATCCAG TGCATACTTA TAAAATGAAA TTTATAAAAT CTTTCGAAGC CATTGATTTG 6660 

CATCATAATA TTTATGAAAA TGGTAAATTA GTATATCAAA TGCCAACAGA AGATGAATCA 6720 

CGTGAATATT TAGCACTAGG ATTACAATCT ATTTGGGATG AAAATAAGCG TTTCCTGAAT 6780

CCACAAGAAT ATCCAGTCGA TTTAAGCAAG GCATGTTGGG ATAATAAACA TAAACGTATT 6840

TTTGAAGTTG CX;GAACACGT TAAGGAGATG GAAGAAGATA ATGAGTAAAT TACAAGACGT 6900 

TATTGTACAA GAAATGAAAG TGAAAAAGCG TATCX3ATAGT GCTGAAGAAA TTATGGAATT 6960 

AAAGCAATTT ATAAAAAATT ATGTACAATC ACATTCATTT ATAAAATCTT TAGTGTTAGG 7020 

TATTTCAGGA GGACAGGATT CTACATTAGT TGGAAAACTA GTACAAATGT CTGTTAACGA 7080 

ATTACGTGAA GAAGGCATTG ATTGTACGTT TATTGCAGTT AAATTACCTT ATGGAGTTCA 7140 

AAAAGATGCT GATGAAGTTG AGCAAGCTTT GCGATTCATT QAACCAGATG AAATAGTAAC 7200

AGTCAATATT AAGCCTGCAG TTGATCAAAG TGTGCAATCA TTAAAAGAAG CCGGTATTGT 7260

TCTTACAGAT TTCCAAAAAG GAAATGAAAA AGCGCGTGAA CGTATGAAAG TACAATTTTC 7320 

AATTGCTTCA AACCGACAAG GTATTGTAGT AGGAACAGAT CATTCAGCTG AAAATATAAC 7380 

TAAACGAGAA GGTCGTCAAT TATTAGCGTA TCTTGGTGCG CCAAAGGAAT TATATGAAAA 7500

AACGCCAACT GCTGATTTAG AAGATGATAA ACCACAGCTT CCAGATGAAG ATGCATTAGG 7560 

TGTAACTTAT GAGGCGATTG ATAATTATTT AGAAGGTAAG CCAGTTACGC CAGAAGAACA 7620 

AAAAGTAATT GAAAATCATT ATATACGAAA TGCACACAAA CGTGAACTTG CATATACAAG 7680 

ATACACGTGG CCAAAATCCT AATTTAATTT TTTCTTCTAA CGTGTGACTT AAATTAAATA 7740 

TGAGTTAGAA TTAATAACAT TAAACCACAT TCAGCTAGAC TACTTCAGTG TATAAATTGA 7800 

AAGTGTATGA ACTAAA6TAA GTATGTTCAT TTGAGAATAA ATTTTTATTT ATGACAAATT 7860 

CGCTATTTAT TTATGAGAGT TTTCX5TACTA TATTATATTA ATATGCATTC ATTAAGGTTA 7920

GGTTGAAGCA GTTTGGTATT TAAAGTGTAA TTGAAAGAGA GTGGGGCGCC TTATGTCATT 7980 

CGTAACAGAA AATCCATGGT TAATGGTACT AACTATATTT ATCATTAACG TTTGTTATGT 8040 

AACGTTTTTA ACGATGCGAA CAATTTTAAC GTTGAAAGGT TATCX5TTATA TTGCTGCATC 8100 

AGTTAGTTTT TTAGAAGTAT TAGTTTATAT CGTTGGTTTA GGTTTGGTTA TGTCTAATTT 8160 

AGACCATATT CAAAATATTA TTGCCTACGC ATTTGGTTTT TCAATAGGTA TCATTGTTGG 8220 

TATGAAAATA GAAGAAAAAC TGGCATTAGG TTATACAGTT GTAAATGTAA CTTCAGCAGA 8280 

ATATGAGTTA GATTTACCGA ATGAACTTCG AAATTTAGGA TATGGCGTTA CXSCACTATGC 8340 

TGCGTTTGGT AGAGATGGTA GTCGTATGGT GATGCAAATT TTAACACCAA GAAAATATGA 8400 

ACGTAAATTG ATGGATACGA TAAAAAATTT AGATCCGAAA GCATTTATCA TTGCX5TATGA 8460 

ACCTCGAAAC ATACATGGTG GATTCTGGAC TAAAGGCATT CGTCGTAGAA AGCTTAAAGA 8520 

TTATGAACCA GAAGAACTGG AAaGTGTAGT AGAaCATGAA aTTCmAAGTA AaTGAGAaTG 8580

AAmCAATtGC TGATTGTTTG TCACGAATGA AAtGCAAGGG TATATGCCGG TAAAACGTAT 8640 

TGAAAAACCC GTGTTTCAAG AGCATU^GA TGGCACGGTT GAAGTATCAC ATCAAGAAAT 8700 

CGTtTTTGTA GGTAAGAAAA TCCAATAACA TAATCCAATT TAAATAAAGA CTATTTGJ\AG 8760 

AGGAAAGGCT ATTCAAAGTT TGAGTAATTT TACTTTGAAT AGCCTATTTG TTTATACATG 8820 

CAAGATGCTC GATCCATATT GTATGAGAAA CCCCCAGCAA GCTATATAAA GCATATGCTG 8880 

GGGGTTCTTA ATATTTTAAA AATTATTGTT AGATTATATA TATCGTCGCT TTTTCTAAAA 8940 

CAATCTCATC GCATGAAATT TTTTCTTCCT AGAQACCTTT AATAAGATTA ATAGTTTACT 9000 

TAATCATATC TAGATAGTCT TATGACTTAT GCTTAATGAA AGTCATTCTA GGAGAAGTTC 9060

CCAAAGCTTC TGTGTTCATA ATTGTTAGTA GTATTTTATT ATCATTTGGT ATAAATATTT 9120 

CAATAACAAT TGAGCTATTA TTTTTATTAT ATAATGTGAG TTGTTTGTGT TCTGTATTTA 9180 

CATTTAAATC TTGAGGATGC CATTCTCCCT CAATAATATT AAGATAATAC TTAGCCTCTG 9300

AATTACATTT GAATTTATCA ATACTAAATA ATTCAATTTG TTCCATAATA TTATTTACCT 9360

TTCTAAAATA CAAATTTTAA TAACCATAAA TAGATGAATA CCATCGATAA TGGTCGCCAT 9420

TGGATACTGG AATAACATTG TTTTTAGCAT CTTGAGTCAT AAAACCATTA TCCCATGGAT 9480

TCCATATAAT TATAACCTCT TGTCCATTAT CTAATTTAGC GTTCCCAACA ACTGCCATGG 9540

CATGCCCTGC GTGCATACCA TTTCTTGATT CTACTCTACT ACCTAAAACA GCAATTCCTT 9600

TATTATTTTT AGTAAGATTG TCAACTTCAT TATATGTAGT CATTCTATTA AGAAGTTGTG 9660

GACTTCTTCC CTGAGTTTGT CCAAAATAAA TCATCTCTCT TGGCGTTAAA CCA6TAAATT 9720

GGAATCGTTG TCCTTGTAAG TTTGGGTGTA AAAATCTCAT CACAGCTTCT GCATGATATT 9780

TGTTAGTATT ATAAGTCGCA TTTAGTAATT CAGACATC6T ATAGCCTGCA CACCAACCAT 9840

TGTTACCTTG AGTTTCTCTT ATCTTGAAAT TCTCAAGTTT ATTTATATAT TGsTCGTTGT 9900

AAGTATAATT ATTACTTTTA AATTGACTAG TTGGCATAGT GACAGAAGCT TTTTGCTTTA 9960

GTTGCGTTAC ATTATTGCCA GTAGGTATAC TCTCAGTCTT TnTnAACTnT nTATCTTCTA 10020

GACGTGGTGT TTTTAGTACT AGTTTAGCTT TATGATTTTG AGTACCACAT AGTAACCTTT 10080

TGAGTTGT 10088



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7563 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:



CGGAAACGnA CCCnATGCGT ATGCTTGACG TGCCAAAATT AAATACGAAG TTCATAGCTT 60

TGAGGTACCA GAAGAACATT TATCTGGTCA AGAAGTCGCA GnACTCATAC AAGCAAATGT 120

TAAAACAGTA TTTAAAACGC TTGTTCTAGA AAATACAAAA CATGAACATT TTGTATTTGT 180

TATCCCAGTA AGTGAAACTT TAGATATGAA AAAGGCAGCT GCTTTGGTTG GAGAGAAGAA 240

ATTGCAGCTT ATGCCTTTAG ATAATTTGAA AAATGTAACG GGATACATTC GTGGTGGGTG 300

TTCGCCTGTT GGTATGAAAA CATTGTTTCC AACAGTCGTT GACAAATCGT GTGAAAATTA 360

TAGTCATATC AGTGTGAGTG GTGGGCTTCG AACAATGCAA ATCACAATAG CTGTTGAGGA 420

TTTGATTACA ATAACTAAAG GCAAAATTGG AGCAGTTATC CATGAATGAT TAATAACAAC 480

TGCCACACTC CTTTTTGATT GAATTAGCAT TTTAOGATCA TAAACAGTCA TTATAATTGA 600

GTATTTGAAC ATAAAAATOT AATTTTATCG TAACAATTTG AGTGTTTGTG ATTGTTTTTG 660

GTAATTTATG ATTGAAAAGT GAAAGCGTAC TCATTATAAT ACAAAGTGAG ATGGGGTGAT 720

GATGATAATT ACTGaAAAAA GACACGAGTT AATATTAGAA GAACTTTCGC ACAAAGATTT 780

TTTGACTTTA CAAGAATTAA TAGATCGAAC TGGTTGCAGT GCTTCAACAA TACGArGAGA 840

TTTATCTAAA CTACAACAAT TAGGGAAATT GCAACX3TGTG CATGGTGGTG CAATGTTAAA 900

AGAAAATCX3T ATGGTTGAGG CGAATTTAAC TGAAAAATTA GCAACX3AATC TTGATGAAAA 960

GAAAATGATT GCTAAAATAG CAGCTAATCA AATCAACGAT AATGAATGCT TATTTATC6A 1020

TGCTGGTTCA TCTACATTGG AGCTAATTAA ATATATTCAA GCGAAAGATA TCATTGTGGT 1080

AACCAATGGT TTAACACATG TAGAAGCTTT ACTTAAAAAA GGTATTAAAA CAATTATGCT 1140

AGGTGGTCAA GTTAAAGAAA ATACACTTGC TACGATTGGT TCTAGTGCTA TGGAGATATT 1200

AAGACGATAT TGTTTCGATA AAGCTTTTAT CGGGATGAAT GGATTAGATA TTGAACTTGG 1260

ATTAACTACT CCCGATGAGC AAGAGGCATT AGTTAAACAA ACAGCAATGT CATTAGCCAA 1320

TCAATCATTT GTACTTATAG ATCATTCTAA GTTTAATAAA GTATATTTTG CTCGTGTACC 1380

TTTGCTAGAA AGTACGACAA TCATCACATC TGAAAAAGCA TTAAATCAAG AATCGTTAAA 1440

AGAATACCAA CAAAAGTATC ACTTTATAGG AGGGACTTTA TGATTTATAC AGTGACTTTC 1500

AATCCTTCAA TTGACTATGT CATTTTTACG AATGATTTTA AAATTGATGG TTTGAACAGA 1560

GCAACAGCAA CATATAAATT CGCTGGGGGG AAAGGTATTA ATGTCTCGCG CGTCTTAAAG 1620

ACATTGGATG TTGAGTCAAC TGCCTTGGGA TTTGCAGGTG GATTTCCTGG GAAATTCATT 1680

ATAGATACAT TAAATAACAG TGCAATTCAA TCGAATTTTA TT6AAGTTGA TGAAGATACA 1740

CGTAJTAATG TGAAATTAAA AACAGGACAA GAAACAGAAA TCAATGCACC GGGTCCTCAT 1800

ATAACGTCAA CACAATTTGA ACAACTGTTA CAACAAATTA AAAATACAAC AAGCGAAGAT 1860

ATAGTTATTG TTGCTGGAAG TGTACCAAGT AGTATTCCAA GCGATGCGTA TGCGCAAATT 1920

GCACAAATTA CAGCACAGAC AGGTGCTAAA TTAGTAGTCG ACGCTGAAAA AGAATTGGCT 1980

GAAAgCGTTT TACCATATCA TCCACTATTT ATTAAACCTA ATAAAGATGA ATTAGAAGTG 2040

ATGTTTAATA CAACAGTGAA CTCAGACACA GATGTTATTA AATATGGTCG TTTGTTAGTT 2100

GATAAAGGTG CGCAATCTGT TATTGTCTCG CTTGGCGGTG ATGGTGCTAT TTATATTGAT 2160

AAAGAAATCA GTATTAAAGC AGTTAATCCA CAAGGGAAAG TGGTTAATAC AGTTGGCTCT 2220

GGTGATAGTA CAGTTGCAGG CATGGTGGCT GGAATTGCTT CAGGTTTAAC GATTGAAAAA 2280

TGATTGGTTC AGGTATAGGT GGCGCAATTG CTTTAGGCTT AGGTTCACGA ATTACTGCGC 4200

CACATGGTGG TATTATTGTA ATTGTTGGTA CTGATGGTGC ACACTTACTT CAAACTCTTA 4260

TTGCACTTCT AGTTGGCACA TTAGTTTCAG CATTAATTTA CGGTTTAATC AAACCAAAGT 4320

TAACTGAAAC AGAAATCGAA GCTTCAAAAT CAATGGACGA GTAGTTTTAA TGATGTAAAA 4380

TGATTGTTAG CAAAGAGCTT CATATTAAGT TGTATGTTCA ATGAATATAT GTTAGTTTTA 4440

TATATCGT6T TAACGGTAGC TTATACAAAG CTGTAAAAAC ACTTTCTATT AATTCAGTTT 4500

TTATGAATTG ATATGAAAGT GTTTTTATTT TTAGATAAAT GAATGAAGAA ATAGACACCA 4560

CAAATGTATA GACTTTTTTA ATATTTTGCA AAAAGTTATG CCAAACGAAG CAGATATAGT 4620

AAAATATGAG TGTCTTAAAG TGAAAATTTA TAAATAAAGA AGGGTTTATA CGTGTCAGAA 4680

TTAATTATAT ATAACXSGCAA AGTTTATACT GAAGATGGCA AAATCGATAA TGGTTACATT 4740

CATGTGAAAG ATGGACAGAT TGTTGCAATT GGAGAAGTGG ATGATAAAGC AGCAATTGAT 4800

AATGATACGA CAAATAAAAT TCAAGTGATT GATGCTAAAG GTCATCATGT ATTACCAGGT 4860

TTTATTGATA TACATATTCA TGGTGGTTAT GGTCAAGATG CAATGGATGG GTCATACGAT 4920

GGCTTAAAAT ATCTATCCGA AAATTTGTTG TCTGAAGGGA CGACATCATA CTTGGCCACT 4980

ACAATGACGC AATCGACTGA TAAAATAGAT AATGCACTTA CAAATATTGC TAAATATGAA 5040

GCGGAgCAAG ATGTTCACAA TGCAGCGGAA ATTGTAGGTA TACATTTAGA AGGACCATTT 5100

ATATCTGAAA ATAAAGTTGG TGCTCAACAT CCGCAATACG TTGTACGCCC ATTTATCGAT 5160

AAAATTAAAC ATTTTCAAGA GACTGCTAAC GGATTAATAA AGATTATGAC GTTTGCACCT 5220

GAAATTGAA6 GTGCAAAAGA AGCGCTTGAA ACGTATAAAG ATGACATTAT TTTTTCAATT 5280

GGTCATACAG TAGCAACATA CGAAGAAGCA GTTGAAGCTG TTGAGCGAGG AGCTAAACAT 5340

GTCACGCATT TATATAATGC AGCGACGCCA TTCCAACATA GAGAACCAGG TGTTTTTGGA 5400

GCAGCATGGT TGAATGATGC TCTACATACC GAAATQATTG TTQATGGCAC TCATTCTCAT 5460

CCGGCATCGG TTGCAATTGC TTACCGTATG AAAGGTAATG AACGTTTTTA TTTAATTACC 5520

GATGCAATGC GTGCAAAAGG TATGCCTGAA GGAGAATATG ATTTGGGTGG ACAAAAAGTA 5580

ACTGTTCAAT CGCAACAAGC ACGTCTTGCA AATGGTGCGC TTGCTGGTAG TATTTTAAAA 5640

ATGAATCATG GGTTACGTAA CTTAATATCA TTTACAGGTG ATACATTAGA TCATTTATGG 5700

CXiAGTAACAA GTTTAAATCA AGCCATTGCA TTAGGTATCG ATGATAGAAA AGGTAGTATT 5760

AAAGTAAATA AGGATGCAGA TCTTGTTATT CTAGATGATG ATATGAATGT AAAATCTACA 5820

ATAAAACAAG GCAAGGTTCA CACATTTAGC TAATAAATAA TCATAATTAA ATGTATGCAA 5880

TTTTCTGGGG GTGTCTAAAT GGGAAGGCGA TAACATGTAG TTGTAATTTA AGTCATAGTG 6000

ATAAATTTGA ATGCGTGTTA CCCATGAGTG ACACATATAA CATGGAGGTG AATCCCTAGA 6060 

AATAGGGAAT TAATTGGAAA CTTCGACCAT AATTAGTTTG ATTATATTTA TTCTATTAAT 6120

TGCATTAACC ACTGTATTTG TTGGTTCAGA ATTTGCATTA GTAAAAATTA GAGCAACAAG 6180 

AATTGAACAG CTAGCAGATG AAGGAAATAA ACCTGCTAAA ATAGTAAAAA AGATGATTGC 6240 

TAATCTAGAT TATTATCTTT CTGCTTGTCA GTTAGGTATA ACAGTAACAT CTTTAGGGTT 6300 

AGGTTGGCTT GGTGAACCAA OGTTTGAAAA GCTATTACAC CCAATATTTG AAGCAATCAA 6360 

TTTACCAACT GCATTAACGA OGACGATTTC GTTTGCAGTG TCATTTATAA TCGTTAOGTA 6420 

TTTGCATGTA GTACTTGGTG AATTAGCGCC TAAATCTATA GCTATTCAAC ATACTGAAAA 6480 

GCTTGCTTTA GTATATGCAA GACXa^TTGTT CTATTTCGGT AACATTATGA AACCATTGAT 6540 

TTGGCTGATG AATGGTTCTG CACGTGTTAT TATTAGAATG TTTGGTGTAA ATCCTGATGC 6600

CCAAACTGAT GGAATGTCAG AAGAAGAAAT CAAAATTATT ATTAACAATA GTTATAATGG 6660 

TGGAGAAATC AACCAAACTG AATTGGCATA TATGCAAAAT ATCTTTTCAT TCGATGAAAG 6720 

ACATGCAAAA GATATAATGG TACCTAGAAC TCAAATGATT ACACTAAATG AACCTTTTAA 6780 

TGTAGACGAA TTACTAGAAA CAATAAAAGA ACATCAATTT ACGCGTTATC CAATTACTGA 6840 

TGATGGTGAT AAAGACCACA TTAAAGGATT TATTAACGTC AAAGAATTTT TAACTGAATA 6900 

CGCTTCTGGA AAAACGATTA AAATAGCAAA CTATATaCAT GAGTTGCCAA TGATTTCAGA 6960 

GACAACACGT ATCAGTGATG CATTAATTAG AATGCAACGT GAACATGTAC ATATGAGTCT 7020 

TATTATAGAT GAATATGGTG GAACGGCAGG TATTTTAACG ATGGAAGATA TTTTAGAAGA 7080

AATCGTTGGA GAAATTCGTG ATGAATTTGA TGATGATGAA GTGAATGATA TCGTTAAAAT 7140

TGAT5ATAAG ACATTCCAAG TAAATGGCAG AGTACTATTG GATGATTTAA CTGAAGAGTT 7200

CGGTATAGAA TTTGATGACT CTGAGGATAT TGATAOGATA GGTGGATGGT TACAATCTCG 7260

TAATACCAAT TTACAAAAAG ATGATTACGT GQATACAACT TATGATCGCT GGGTTGTTTC 7320 

AGAAATCGAT AACCACCAAA TTATTTGGGT GATATTAAAC TATGAATTTA ATGAAGCGAG 7380 

ACCTACTATC GGACAGTCTG ATGAAGATGA AAAATCAGAA TAGATATTAA TATATAAACC 7440 

AACTAAGAAT GATTTAATTC ATTTTTGGTT GGTTATTTTT TTGACTAAAA TTAAnGAAAA 7500 

GTGAAAATAG TATTGGAACT CAATATCTTT AATGATTTAA TGAATAAnTT TTATTGAAAG 7560 

CGA 7563 
(2) INFORMATION FOR SEQ ID NO: 34: 
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<A) LENGTH: 3492 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

TTATATCAAC TTCATGGCGG AACCATTGAT GACCCATTAG ACGAAACAAT AAGCGCATTT 60 

SATGAATTGA AACAAGAAGG AATTATACGT GCTTACGGTA TTTCTTCTAT TCGCCCAAAT 120 

GTAATTGATT ATTATTTAAA ACATAGTCAA ATCGAAACGA TAATGTCTCA ATTCAATTTG 180 

ATTGATAATC GTCCAGAATC ATTATTAGAT GCAATTCACA ACAATGATGT TAAAGTATTG 240 

GCAAGAGGAC CTGTGTCTAA AGGATTATTA ACTTCAAACA GTGTTAATGT GCTCGACAAT 300 
AAATTTAAAG ATGGTATTTT TGATTATTCT CATGATGAAT TGGGTGAAAC AATAGCCTCT 360

ATTAAAGAAA TTGAAAGTAA TTTATCTGCA TTGACATTTA GTTATTTAAC ATCACATGAC 420 

GTGCTTGGTT CCATCATTGT AGGTGCAAGT AGCGTCGACC AATTAAAAGA AAATATTGAA 480 

AACTATCATA CTAAAGTTAG TTTAGATCAG ATTAAAACAG CAAGAGCTCG TGTAAAGGAT 540 

TTGGAATATA CCAATCATTT AGTGTAGAAG TCATTTTCAG TAATAAAAAC AGCAGCATGA 600 

GGCGTTTCAT TATAAAAATG CCTTACTGCT GTTGTTTATG TACAATTCGC TATAATTTAT 660 

GATTATGATT ACTCACTTAT GATAGAAATT AAAGCGTTGT CCTCACGCAT CAGTATTTAG 720

TAATTTCGCC TTGCGGCATT GCCTTAAGCA AACTTCTGCC ACTTCATCTC TTAATAATIT 780 

TATTAAAACA TCTTTCTATA TTTCACTTCG CATGTTGATT CATCATTATT AGTTATTATT 840

TGTACACCCA GCACATTTCC TTGCAACACA AGTAGTTTGA ATTTTTCACA AGTATAATAT 900

AATGTACCGT CTGAAATTTG GTCTACAGAA ATATCGCCTA AAATATCCAG CACTGTAAAT 960 

TCTTCAAATA CTGATAGTTG TTCCGCATAT CGTACACAAA GTCTTACCAC ACTCTCCGAT 1020 

TGACAGTTCA TTGCCATCCC ACCTATTTAT GCTTTATTTT TAAATAATTT AGGGAAACAT 1080 

CGTTCAAAAA ATCTAGGCGC AATTTGATAC ATTTTCAACG CATGaTGCAT CCATTTAGGC 1140 

CGATTAATTT CCAATTGTTT TGTTTTAATG CCATAAATGA TATCTTCTGC AAGCTGATTA 1200 

GCATCAAGCA TAATTTCGCC CATCTTTTTA gCATACTTCA TTGATGGGTC GGCTTTTTGA 1260 

TGAAAAGGTG TATCAATCGG GCCAACATTA ACTGTCATGA TATGTAAGTT TGGTGACTCT 1320 

AGTCTTAAAG CATTCATTAA TGCATAAAAC CCTGCTTTCG ATGCCCCATA ATGTGCAGCA 1380

TTTGCTTGTG TGGAAAATGC AGCTTGACTT GAAATACCTA CAATATGTGC GTTAGATGTT 1440 

AAATATGGTC TCAACACAGT ATATAAAACA TTAAAACTAA TTAAATTAAG CTGATACGTT 1500 
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10 



TAAATGAATC CATCGAATGA TGTATTGTCT TCAAATTGCA GTGCCTGTAT CGACTTCAAA 1620 

TCATTTAAGT CACAAGGAAT AACATTTATA GTTTTCCCCA ATTCCTGTTC AAAGATTCTA 1680 

GTTGCTTTAT CAACATCACG CACCAACAAC GTTACATGCA CTTTATTTTC TAGTAACTTT 1740 

CGGACAATCG ATAAACCTAA ACCACTCGTA CCACCAGTCA CTATAAAATG TTGTCCTTTC 1800 

ATCAATTAAC CTTCCTTTTC AATTATATAG AATGCAATTT ATCAACTTTA CATAATTGAG 1860 

ACAAGTTGAT TATCTTTCCT AATATATATA CAATAATAAG AAAATATAAC ATACAAATCA 1920 

AAAACTAAAG GGATGTGaCG TTAATGrAAC TCGTATTTTA TGGAGCTGGT AATATGGCAC 1980 

AAGCTATATT TACAGGrATT ATTAACTCmA GCAACTTA6A TGCCAATGAT ATATATTTAA 2040

CAAATAAATC TAATGAACAA GCTTTAAAAG CATTCGCTGA AAAACTAGGT GTTAACTATA 2100 

GTTATGAtGA TGCGACATTA TTAAAAGATG CAGAyTATGT ATTTTTAGGT ACCAAACCAC 2160 

ATGACTTTGA TGCTCTAGCA ACACGCATCA AACCACATAT TACAAAAGwC AATTGCTTCA 2220 

TTTCAATTAT OGCAGGTATT CCGATTGATT ATATTAAACA ACAATTAGAA TGCCAAAATC 2280 

CaGTTGCTAG AATTATGCCA AACACAAATG COCAAGTTGG ACACTCTGTT ACTGGCATTA 2340 

GTTTTTCAAA CAACTTTGAC CCTAAATCTA AAGATGAAAT TAACGATTTA GTTAAAGCAT 2400 

TTGGTTCTGT AATTGAAGTA TCAGAAGATC ATTTACATCA AGTAACAGCT ATCACCGGAA 2460 

GCGGCCCAGC ATTTTTATAT CATGTATTCO AGCAATATGT TAAAGCTGGT aCaAAACTTG 2520 

GTCTAGAAAA AGAACAAGTT GAAGAATCTA TACGCAACCT TATTATAGGT ACAAGTAAGA 2580 

TGATTGAACG TTCAGAtTTG AGCATGGCTC AATTAAGAAA AAATATTACC TCTAAAGGTG 2640 

GTACGACACA AGCTGGCCTT GATACATTGT CACAATATGA TTTAGTATCT ATTTTCGAAG 2700

ATTGTCTAAA CGCTGCCGTC GACCGTAGTA TTGAACTTTC TAATATAGAA GACCAATAAA 2760 

AACA5ACCCG CCAACACATG TATGCATCAT CGCAAGCACT GTGTTTGACX5 GGTTATTTTT 2820 

ATAATTTATT GTTATTTGGC AAGCATTGTT TATTACTTTG TCATTAGATT TTTJ^CTAT 2880

CAAAATCTTT TACAAAATTA AAATTAGGTG TATCTTCATT TTGTATCAAT GTTTGATAAA 2940 

TTTCATTTAT ATCTTCTGTA TTATAGCGAT TGCTCAAATG TGTAATCAAC GTACGTTTAA 3000 

CATTGGCTTC TTTTATCAAT GCAAATACGT CTTCAATATG GCTATGATGA TAATTGTTGG 3060 

CTAAATGCTT TTCACCATCT ATATAGGTCG CTTCATGTAC CATCACATCA GCATCTCTAG 3120 
AAATCACACG TTCATTAGAA CATGGTTTTG TATCACCAAA AATTGCTACA ACTGGACCCT 3180

GTTTGGACTC ACCTCTAAAA TCTTTTGATT GATAAACTTG ACCATTATGT TCAAATGTAT 3240 

CATGAGATTT TACTTCTTGA TATTTAGGAC CTGGTTCAAG ACCAATGTTT TTTAACGCTT 3300 
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CATGATTAAG TAAATGCGCC TCTACAGTAA AACCATCCAT GATGATATGT CAGATGATCA 3420 
TCGATTTCAA TATATGtAAT TGGATAGTTT AAATGTGACT CTGATAAATT CATAGACATT 3480

TCCACATATG CT 3492 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1973 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
ATCTAGCGGT ACAAGCGTCT TGGAGGCTAG TATGTTGAAC ATTGTAAACC CTGAAGATCA 60 
CTTCGTTGTC ATTGTTTCAG GTGCCTTTGG TAACCGATTT AAACAAATTO CACAAACTTA 120
TTACAAAAAT GTGCATATTT ATGACGTAAC ATGGGGAGAA GCTGTAGATG TCAAAGATTT 180 
CATCAATTTC CTTTCAACTT TAAATGTTGA AGTTAAAGCA GTATTTAGTC AATATTGCGA 240 
AACATCTACG ACAGTGCTAC ACCCTATTCA CGAGTTAGGA AATGCCATTA ATCAATTTAA 300 
TAGTAATATT TATTTTGTAG TTGACGGCGT AAGTtGCATT GGTGCTGTTG ATGTTGACAT 360 
TAACAAAGAT AAAATTGATG TACTTGTTTC TGGTAGTCAA AAAGCAATTA TGTTACCTCC 420 

AGGATTAGCT TTTGTAGCTT ATAGCCACCG TGCAAAAGAA CATTTCAAAG AAGTAACTAC 480 
GCCAAAATTT TATCTAGACT TAAATAAATA CATTTCGTCA CAAGCTGACA ATTCTACACC 540 
GTTCACACCA AATGTGTCTT TATTTAGAGG TGTAAATGCA TACGTTGAAA CCGTAAAAGC 600 

AGAAGGTTTC AATCACGTAA TAGCACGACA CTATGCAATT AGAAATGCAT TAAGAAGCGC 660 
CTTAAAAGCA TTAGATTTAA CTTTATTAGT CAATGATAAA GATGCATCTC CAACGGTTAC 720 
AGCATTCAAA CCTAATACAA ATGATGAAGT GAAAATAATC mAAGATGAAC TTAAAAATnG 780
CTTTAAAATA ACAATTGCnG GTGGTCAAGG CCATCTTAAA GGTCAAATTT TnAGAATTGG 840 
TCATATGGGG AAAATTAGTC CTTTCGATAT TTTATCGGTA GTATCTGCTT TAGAAATTAT 900 
TTTAACTGAA CACCGTAAAG TTAACTATAT CGGTAAAGGT ATATCAAAAT ATATGGAGGT 960 
TATTCATGAA GCAATTTAAT GTACTCGTTG CAGATCCCAT ATCAAAAGAT GGTATCAAAG 1020 
CATTATTAGA TCACGAAGAA TTCAATGTAG ATATTCAAAC TGGCTTGTCC GAAGAAGCAT 1080 

TAATCAAAAT TATACCTTCA TACCATGCTT TAATCGTTCG TAGTCAAACT ACGGTTACTG 1140 
AAAATATCAT AAATGCTGCT GATTCTTTAA AAGTAATCGC ACGCGCCGGT GTTGGTGTAG 1200 
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GTAATACGAT TTCAGCTACT GAACATACAC TGGCAATGTT ATTATCAATG GCACGAAATA 1320 
TTCCGCAAGC ACACCAATCA CTTACAAATA AAGAATGGAA TCGAAATGCA TTTAAAGGTA 1380 

CTGAGCTITA TCATAAAACA TTAGGTGTCA TTGGTGCTGG TAGAATTGGT TTAGGTGTTG 1440
CTAAACGTGC GCAAAGTTTC GGAATGAAAA TACTAGCTTT TGACCCTTAC TTAACGGATG 1500 
AAAAAGCAAA ATCTITAAGC ATTACGAAGG CAACAGTTGA TGAGATTGCC CAACATTCTG 1560 

ATTTCGTTAC ATTACATACA CCACTAACAC CTAAAACAAA AGGCTTAATT AATGCTGTCT 1620 
TTTTTGCCAA AGCAAAACCT AOTTTGCAAA TAATCAATGT GGCACGTGGT GGTATTATTG 1680 

ATGAAAAGGC GCTAATAAAA GCATTAGACG AAQQACAAAT TAGTCGGGCA GCTATOGATG 1740
TGTTTGAACA TGAACCTGCA ACTGACTCGC CTCTTGTTGC ACATGATAAA ATTATTGTTA 1800 
CACCTCATTT GGGTGCTTCA ACAGTCGAAG CTCAAGAAAA AGTGGCAATT TCTGTTTCAA 1860 

ATGAAATCAT CGAAATTTTA ATTGATGGTA CTGTAACGCA TGCAgTGAAT 6CACCTAAAA 1920
TGGACTTAAG CAATATAGAT GATACTGTAA AATCATTCAT CAATTTAAGC CAA 1973 
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7620 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
GGTGTTTCAG ATGTCACTGG TTGATTTTTA ATTGTAGACG GGTATTTTGG GCTTTCGCCA 60 
TATTTATTTG CCGGCTTACT GTCAAAGCAT AGGAATACTA TCATAACAAT TGTTAOGCCT 120 
AAATfiAACAA AATAAAGAAG TACTAACAAA ATATTAAGAC CCATCGGCAT TAATGTAAAA 180 
TCACTGTCAT AATAACTATC GATAATCTGT AATACTATAT AAAATATAAT ACTGAATACT 240
GTCATAATCA TTGGAAATAA CATTGTTCTT GATATATCGT GAAATCTTCG AACGCACAAC 300 
GCTAAATTTG GAATAAACGT TGCCAAACTA TAGACAAAAG TATACACAGA TGTAAGGATA 360 
ATCATCAATA TACTCATAAC TATTAATGTT TCGTTATCCG CCGCTATAGA AATAAAGAAT 420 
AGAAATAGGT TTATTATTAG CACACACACA GCTGGAACCA TAAGTATCAA ATGCCATAGT 480 
GCCATATACC AATATTCACT ACGTCTTGAT CTCCCCTTAA AATTTACATA ATTTTTCCAA 540 

AATAAAACGA ATGATTTCAT AAAACCTACT TGAGGTAATT GTTCCATTGT AATCTCCCTT 600 
TCGTTAATCA TATTTATATT TTTAATTATT GTTACCGTTA TAATTTACAA GATTCATTAT 660 
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GTAAAATGAA AACCCGCTAC AAGTACACAT CTATATGGAG ACTCATTTGA AAGTCAACGC 780 

TTCGTTAACT ATACTAAAAA TATGTCATAC TGCAATGTTC ACGTTTAAAA GAGTCTCAAT 840 

CTATGCAAAT AAAATATTCC ATAACAAAGT ATATACTTTA CATTTTTATA ATTCTTAACA 900 

ATACTATTTT ATCAAACATT TACCACAATA AAAATATCTT TTTCATTTTT ATTTAAATTA 960 

ATCATATAAT TGCGAGGAGA ATATTATGGA TTTCGTTAAT AATGATACAA GACAAATTGC 1020 

TAAAAACTTA TTAGGTGTCA AAGTGATTTA TCAGGATACC ACTCAAACGT ATACAGGCTA 1080 

CATCGTGGAA ACGGAAGCTT ACTTAGGTTT GAATGATCGT GCGGCTCATG GCTATGGCGG 1140 

TAAAATAACA CCTAAAGTCA CGTCATTATA TAAACGfTGGT GGTACAATTT ATGCACATGT 1200 

CATOCATACG CATTTACTCA TTAATTTTGT AACAAAATCT GAAGGTATAC CT6AAGGCGT 1260 

ACTTATCCGC GCAATTGAAC CAGAAGAAGG TTTATCCGCT ATGTTCCGTA ACAGAGGTAA 1320 

GAAAGGCTAC GAGGTAACGA ATGGCCCAGG AAAATGGACT AAGGCATTTA ACATTCCACG 1380

GGCTATCGAT GGCGCTACGT TAAATGACTG TAGATTGTCT ATTGATACTA AGAATCGTAA 1440 

ATATCCTAAA GATATTATTG CTAGTCCACG AATCGGTATT CCAAATAAAG GTGATTGGAC 1500 

ACATAAATCT TTACGTTACA CAGTGAAAGG TAATCCATTT GTGTCTCGCA TGCGTAAATC 1560 

AGATTGTATG TTTCCCGAAG ATACTTGGAA ATAAATGCCA TCTTTCATTG ATTACTATCA 1620 

TGAAAATGAA ATCTATCTCC TTATAAGTCA ATCAATCGTG CCGTCAACAT GCGGATGGGT 1680 

TGATTGTTTT TCTTTGTATC CATCATATTT TTTGATTCAT CTCCTCTTAT TGAACTTGTT 1740 

CTTAATTATA AAATATAACA ATAGAATTAT TTATAATTAT TAAATTTAGA TGCATTAATA 1800 

TTATTGATAT TATTTTCAAA AACTAGAAAT ATTGATTTGT TGCATGTATA ATGTTAAAAG 1860 

CGCCCTTTTA TAACGCTTAC ATATAAAAGC TTATTTAGGG AGAGGGATAT TCAACAAOGG 1920 

GGATTTGAAA ATGATAGAAC TTAATGCAAT TACAACATTA TGTTTAGCTT GTATCCTTTA 1980 

TTTACTTGGT AAGGCTATCG TTAATCACGT TAATTTTTTA AAACGTATTT GTATACCAGC 2040

ACCAGTGATT GGCGGCTTAA TCTTTGCTAT TTTAGTTGCG GCTTTGGATT CATTTGGCAT 2100 

GGTTAAGATT AAATTAGATG CTTCATTCAT TCAAGATTTC TTCATGTTAG CATTCTTTAC 2160 

GACAATCGGT CTTGGTGCAT CATTGAAATT ATTTAAATTA GGTGGCAAAG TCTTGCTATT 2220

ATACTTTATG TTTTGTGCTA TCATTTCAGT CATTCAAAAC ATAGTTGGTG TATCACTAGC 2280 

AAAAGTATTA AATATTAAAC CTTTGTTAGG ATTAACAGCA GGTTCCATGT CTATGGAAGG 2340

CGGTCATGGT AATGCTGCTG CTTATGGTAA GACAATTCAA GATTTAGGTA TTGATTCGGC 2400 

ACTGACAGCG GCTCTTGCAG CTGCAACTTT AGGTCTTGTA TTTGGAGGGC TTATCGGTGG 2460 
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ATTTAAAGAT TATAGCCAAG TAGCATATAA CGAACATTTA CATAGTAAAT TTAATGCCAC 2580 

TGAAGTATTC TTCATTCAAT TTACAATCGT TGTATTCTGT ATGGCAGTTG GAAGTTATTT 2640 

CAGTCATTTG TTTACAGCTC AAACAGGGAT TAATGTTCCA ATTTACGTTG GCTCATTATT 2700 

TGTAGCTGTT ATTGTCCGAA ATATCTCTGA AAGTTTTAAT TTTAATATTG TAGATTTAAA 2760 

AATTACTAAT CAAATTGGCG ATGTCGCATT AGGTATTTTC TTATCTCTTG CGCTAATGAG 2820 

CATTCAATTA ATCGAAATTT ATAAACTTGC TATACCTCTT ATTATTATCG TTTTAGTTCA 2880 

AGTTGTCGTT ATGATTTTAT TTGCTGTTTT AATTTTATTT AGAGGTTTAG GAAAAGATTA 2940 

TGATGCTGCA GTAATGGTAG GTGGTTTTAT CGGTCATGGG CTTGGTGCAc GCCAAATGCC 3000

ATGGCAAAIT TAGATGTTAT TACTAAAAAA TATGGAAACT CACCTAAAGC ATATTTAOTT 3060 

GTACCTATTG TTGGTGCATT CTTAATCGAT TTAATTGGTG TTATAGTCAT TATGGGATTC 3120 

ATACAATGGT TTAGTTAAAC ACCAAACTCA TAAATAAAAG AGGAGGCCTT CGCCTCcTcT 3180 

TTTATTTATC CTCGATGTAT ATTCAAGTTA CGTTGTTCTA TCCATGACAA TATTTCCGGA 3240

CTAAATACGA TTTGTTTTTG TGTTAAGTCG TCAATATTTT TAGCATCTAA CATCGTCATT 3300 

ATTGATTTCA TGTGTTCAAT AAATGATTCT ACATAAGCTA CTGTATGTGC AATGCCATTA 3360 

TTTTCAACTT GATTTAAAAA CGGACGTGAC ATACCAGTTG CCTTTGCACC AAGTGCTAAA 3420 

CTTTTAATTG CATCGAGTGG TGTACGTAAA CCACCACTCG CGAAAACTGA AATTTCGCTT 3480 

TGATAAGCCG TTGTTTCAAG TAATGACTCA ACTGTAGACT GTCCCCATGA TGATAAGTAA 3540 

TCCATATCTT TATTTGCACG ACGTTCATTT TCAATATCTA CAAAGTTAGT ACCACCTTTG 3600 

CCACTAACAT CGACATACTT GACGCCTATT TGTTGTAAGT CATGCATTAA TTCTTTGCTC 3660

ATACCAAATC CAACTTCTTT TATAATGACT GGAACAGACA CTCGTGATAC AATCGACGCT 3720 

ATATIATCTA ACCAAGTCAC AAATTCACGA TTCCCTTCAG GCATAACTAA TTCTTGAGGA 3780 

GAATTAACAT GGATTTGTAA CX5CTTGTGCC TCAAGTAATT CAACTGCTTC CAAAGCCTTT 3840

TCTACPGGTA CGTCCGCACC AACATTGCTA AAAATCATGC CTTCAGGATT CATTTTTCGC 3900 

GCAATCGTAA ACGTCTCAGC CATGCGTGGA TTTCTCAATG CCGCATGTGT TGATCCAACT 3960 

GCCATCGCTA AGCCAGTTTC TCTTGCAACT ACAGCTAGCT TTTCATTGAT GTTTTTCGTC 4020 

CACTCGCTAC CACCCGTCAT TGCATTAATA TAAACCGGAT ATGCCATCGT TAAGTCAGGC 4080 

GTCTGTGATG TCAAATCGAT ATCATTTACA TTAATTGATG GGATAGAATG ATGCACAAAA 4140 

CGCATCTTAT CAAAATCTGA ATGCATTGCG TCAGATTGGG CCATTGCTAT TTCAACATGT 4200 

TCATTTTTTC TCTGTTCTCT TTGAAAATCA CTCATGATTA AACCTACCTT TTCGTCATTT 4260 
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ATTACAGCTA AGCAAATATA ATATCCATAA TGTAAATGTA ATGCCGGCAT ATTTACAAAG 4380 

TTCATACCAT AAATCCCAGC TATGAATGTT AACGGTGAAA ATATAACTGA TACTAATGTC 4440 

AGTACTTGCA TAATACTATT CATTCTAAAT GACGTGTATG ACTCAAAATT TTCTCGTATT 4500 

TCGTTTGTCA TTTCTTGAGC AGTACGAATG ATATTACX5TT GCTTAATCAA GTGGTCATCG 4560 

ATATGTTGAA TGTATAGCGA ATGTTTATTA TCTATAATCA AATCACCATT TTGTTTCATT 4620 

GTATCAATTA GCTCTTGCAT AGGAAACAGT ACACGTTTTA CTTTAATCAA ATCCGAACGT 4680 

AACTTAAAGA CACTATCCAT GACCATTTTA TTAAAGCX5AT CATCTACATG GCX;GTCTTCA 4740 

AAATGATAAA CACTATCTTC AAGTGCATAT ACAAAGTTGA AATATTTATC AACCATCATA 4800

TCEAAAATTA ATATGACGAC ATCTGCACAA TCTAATTCTG CATCTAATGT ATTCATATAC 4860 

TTATAGACTA CTTTATTTAA TGATTCCAAC GTTTGATGAT GATATGTTAC TAATACATTG 4920 

TCTTGTATAA AAATATTTAG TGCTATTGGT GAATAGTTTG ACCCCATAAT ACTATGGAAT 4980

ACTAAGTATT GATAATCTTT ATAAGATTTA TATTTAGCTC GTGGCATACC GTTAATTGCA 5040 

TCATCCACTT CTAAATCATT AAAATTAAAA TGTGCnTAA ACCATTCATT TTCTTGTTCA 5100 

TTCGGTTCAT CAAAATCATA CCAAACAATA GTCGCATCTT TTGGTATCTC TTTGATATCA 5160 

TCAACTACTT TAAACGGTTC ATATGTAGTT TGATACCGTA TCTTTAAAGC CATCGATACT 5220 

CCCCCTAAAT AACGAATTCT CTATTATTTT ATCATGAATT AAATAAC6TG TATGTCTTAA 5280 

TTTATTTTAG TATGATAGTC ACTAAGGAGA TGGTTATTAT CAAACAACTT TTTACACATA 5340 

CTCAAACCGT AACATCTGAA TTCATTGACC ATAACAATCA TATGCATGAT GCAAATTATA 5400 

ATATCATTTT TAGTGACGTC GTGAATCGTT TTAATTACAG CCACGGTCTT TCTTTAAAAG 5460

AAC60GAAAA TTTAGCATAT AGGCTATTTA CACTAGAAGA ACATACGACA TACCTCTCAG 5520 

AATTCTCTCT TGGCGATGTA TTTACTGTTA CTTTA7ATAT TTATGATTAC GATTATAAGC 5580 

GGTTGCATTT ATTTTTAACA TTAACTAAAG AAGATGGTAC ACTAGCATCA ACAAATGAAG 5640

TAATGATGAT GGGAATTAAT CAGCACACAC GTCGTTCTGA TGCTTTTCCT GAATCATTTT 5700 

CAACACAAAT AGCACACTAT TATAAAAATC AATCAACTAT CACTTGGCCT GAACAATTAG 5760 

GACATAAAAT AGCAATTCCA CACAAAGGAG CATTAAAATG ACAGATGCAT TACAACAAAA 5820 

GATTCATATC GAATTACTAG ATTTATTAGA TGATGTTAAG TTTGAATTAA CAGAATTAAA 5880 

TGCACAAAAA GGGTTATACA TTAACGGACC AGCAAATCAG CTACTTAAGC GTGGCGTGCA 5940 

TATGGCTTAT GTTCAAGGAC AAAAGCAAGC CATCGATAAT ATTATGACTA TTGTGGAACA 6000 

ACAGCTTGAA AGATCAACAT TTCCTAGAAC ATTATGATAA ATTTCAAAAT GAGGTTGCTC 6060 
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ATAATTTTTT AGATCAATTT TATCAAATTA AAGGGCAATA CTTTATCATC ACACATATCA 6180 

ATACACTTAT TGGTGATTTT CACTCAGAAG CTCATTAACA ATTAGTCTAT ATAACCCTTG 6240

CTATATTTTC AAAAACAAAA CCCAATTACG TTTTCATGTC AAATATCATC TTGCATGAAA 6300 

TCGTAACTGG GTCATTTATA TGTTATTAGT TATTTTGTGT TACATCCTCA TCTATCGATT 6360 

TGGCAATTTG TTTAATAGCT TTATGTGATT GTCTAATTGG ATAAATTGGA AAATCATGTA 6420 

CCATCTTAGG ATAATCATAA AACTCAATGT ATTGATGATG TTGCAACATC ATTTGTTCAA 6480 

ATAGCTTCAT ATCAGGATGT GTCATTTCAC GTCCACCACC AAACATATAA ACTGGTGGCA 6540 

ATCCTTCTAT TGTGCCATTA ATTGGCGATA TGCGCTTATC TGTTAATGGT AGGCCATTCG 6600

CCCATTTTTT CATAATCTCA TTGACACCAA ACTGACTTAG aACCGCATCT TGTTCGATTA 6660 

AGGCGTCCGA AATATCTTTA TTAGATAGTG TTGCATCTAA AATTGGTGAG ATTAAATACA 6720 

ATTTATTCGG TAATGGCTGT TGATTAkCTA AAAGAGATTG TACAAAGGAT AATGCCAGTG 6780

CACCACCTGA ACCATCACCC ATGACTACGA CATTTTGATG TCCTACTTCA GATACTAATT 6840 

GaTCATAAAC ACGTTGTATC GCTTGGnAAA GTATCGTCaA TATGnAAACT CTGGTGTCTT 6900 

TGGATAGATA GGCAGTACAA CCTCATATAA TGtACTTAAA GTGATTTTAT CCCAACAATC 6960 

TCCAATGGAA CGGTGATGGT TGTAGTGCAT TGAATCCACC GTGAATATAT AAAATTTTCT 7020 

TATCAATTTG ATGTCTGAAA TTAAAGCXSAA AGACTTGCAT ATCATCTAAT GACAATTTTT 7080 

CTAAATTT6C TTTAACATTT AATGTTGAAG GCTGCTTATG TTTTTTTCTA TTTTCAATTT 7140 

CTCTTTTATA AAAAAATCTT TCAACATCTT GATCATTTTT AAACATAATC GAGCGATTGT 7200 

GAAGCAAATA TTTATTGACA ACGCTATTCA TAACACGGTT TCTAATCAAT GTCTTAACCT 7260 

ACCTTTATAT ATTTTATGTA TCCAATGATk GTCTATCCCC TACATTCTTT GCCAAAAAAA 7320 

GTATATAATG TAGAAGATAT TTTCTTTTTC ACTTTCAAAT TTAAGACTAC AATTGAACAG 7380 

TGATTTTTCA TCATTATAAC AGACAACTAG ACATATTGAT AAGTAAAGAA AAGAACTTTA 7440

TACGGAGGTA CCTTGCATGA CAAATCCAAA TCAACGATTA GAACCATTTG ATGAGACATT 7500 

TCAACAACCG AATATTCATC GTGGTAAGCG ATATGGTAAG AAAAAACGTT CATTGGTAAG 7560 

CATGATTATT CAAATCATTG TTGTwATATT AACCACCATC GCTGGAATAC AGCATGGTGG 7620 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 9834 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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<xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 





GTCATtACCG amTTTCtTAG AaTCATTTAA AGATGATAAA TATACAAACG TTGGTAATTT 60

AAAAGAAGTG AATTTTGATA AAATTGCTGC GACGAAACCC GAAGTAATCT TTATCTCTGG 120

ACGTACAGCT AATCAAAAGA ATTTAGATGA ATTCAAAAAA GCTGCACCTA AAGCGAAAAT 180

TGTTTATGTT GGTGCAGATG AAAAGAACTT AATTGGTTCA ATGAAACAAA ACACTGAAAA 240

TATCGGAAAA ATTTACGATA AAGAAGATAA AGCTAAAGAA TTAAATAAAG ATTTAGATAA 300

CAAAATTGCT TCAATGAAAG ATAAAACGAA AAACTTCAAT AAAACTGTTA TGTATTTACT 360

AGTTAACGAA GGTGAATTAT CAACATTTGG ACCTAAAGGT CGTTTTGGTG GATTAGTTTA 420

CGATACATTA GGATTCAATG CAGTTGATAA AAAAGTAAGT AATAGCAATC ATGGACAAAA 480

TGTTTCTAAC GAATATGTTA ATAAAGAAAA TCCAGATGTT ATTTTAGCGA TGGATAGAGG 540

TCAAGCGATA AGTGGTAAAT CAACTGCGAA ACAAGCATTA AATAATCCTG TATTAAAAAA 600

TGTTAAAGCA ATTAAAGAAG ACAAAOTATA TAATTTAGAT CCTAAATTAT GGTACTTTGC 660

AGCTGGATCA ACTACAACTA CAATTAAACA AATTGAGGAA CTTGATAAAG TTGTAAAATA 720

ATTTTAAAAG AGGGGAACAA TGGTTAAAG6 TCTTAATCAT TGCTCCCCTC TTTTCTTTAA 780

AAAAGGAAAT CTGGGACGTC AATCAATGTC CTAGACTCTA AAATGTTCTG TTGTCAGTCG 840

TTGGTTGAAT GAACATGTAC TTGTAACAAG TTCATTTCAA TACTAGTGGG CTCCAAACAT 900

AGAGAAATTT GATTTTCAAT TTCTACTGAC AATGCAAGTT GGCGGGGCCC AAACATAGAG 960

AATTTCAAAA AGGAATTCTA CAGAAGTGGT GCTTTATCAT GTCTGACCCA CTCCCTATAA 1020

TGTTTTGACT ATGTTGTTTA AATTTCAAAA TAAATATGAT AGTGATATTT ACAGC6ATTG 1080

TTAAACCGAG ATTGGCAATT TGGACAACGC TCTACCATCA TATATTCATT GATTGTTAAT 1140

TCGTQTTTGC ATACACCGCA TAA6ATTGCT TTTTCGTTAA ATGAAGGCTC AGACCAACGC 1200

TTAATGGCGT GCTTTTCAAA CTCATTATGG CACTTATAGC ATGGATAGTA TTTATTACAA 1260

CATTTAAATT TAATAGCAAT T^TATCTTCT TCGGTAAT^T AATGGCGACA scgTGTTTCA 1320

GTATCGATTA ATGAACCATA AACTTTAGGC ATAGACAAAG CTCCTTAACT TACGATTCCT 1380

TTGGATGTTC ACCAATAATG CGAACTTCAC QATTTAATTC AATGCCAAAT rrnvrri'GA 1440

CGGTCTTTTG TACATAATGA ATAAGGTTTT CATAATCTGT AGCAGTTCCA TTGTCTACAT 1500

TTACCATAAA ACCAGCGTGT TTGGTTGAAA CTTCAACGCC GCCAATACGG TGACCTTGCA 1560

AATTAGAATC ITGTATCAAT TTACCTGCAA AATGACCAGG CGGTCTTTGG AATACACTAC 1620

CACATGAAGG ATACTCTAAA GGTTGTTTAG ATTCTCTACG TTCTGTTAAA TCATCCATTT 1680






EP0 786 519 A2 

AATTAGCTGA TAATAAAAGT GAAGCAACTA ATCTTACGAC AAAATTAGAA CATAATAATA 3600 

AAGCGTTAAG AGATACTGCX3 AAGAAGAACC TAGATGATAG TAAAGAAAAT GAAGTAAAAG 3660 

GCGCGATTAA AAATCACATT ATGCCAATGA TTGAAAAGCA AATTACCGAT ATTAACCAAA 3720

CTAATATTAG TGATAAGCAT GTTAATAATG CAAGGAAAAA CGCAATAGAA ATGTATTACA 3780 

GTCTGCAGAA CTATTATAAT ACACGTATTG AAACAATAAA GGTTAGTGAG AAGTTATCAm 3840 

AAGTCGATGT AGATAAGTTG CCGAAAAAGG GTATAGATAT AACTCACGGC GATAAAGCCT 3900 

TTGAAAAAAA GCTTGAAAAA TTAGAAGAAA AATAACTATA ATCATTTTTC AAAGTTAAAA 3960 

ATTTTGAATT TATGGTTAAC ATGTCAACTT ACTATGTGTA TAATGGTAAA CATTGATATT 4020 

AACTATATGT ATAAAAATGT CACGCAGATG CTATTTAAAT GTGATAAATA TTTTTAGAGG 4080 

TGAATAGAGT GGCTATAAAG CTAAGTTCAA TTGACCAATT TGAACAGGTT ATTGAGGAAA 4140

ATAAATATGT TTTTGTATTA AAACATAGTG AAACTTGTCC AATATCGGCA AATGCGTACG 4200

ATCAATTTAA TAAATTTTTA TATQAACGCG ATATGGACGG TTATTATTTG ATTGTCCAAC 4260 

AAGAACGCGA TTTGTCAGAT TATATTGCTA AAAAAACGAA CXTTTAAACAT GAATCACCTC 4320 

AAGCATTTTA TTTTGTAAAT GGTGAAATGG TTTGGAATCG AGACCACGGT GATATCAATG 4380

TGTCGTCATT AGCACAAGCA GAAGAATAAT GAAACTATAG GGTTGGAACA TTTTGCCTTA 4440 

CACTACTAGA CGTGAATAGC ACAACTTAAA TTCGTGTGAA TCAGAGTAGT TTGGCTATAA 4500 

TGATGTTCTG ACCTTTTATT TTATGTCACC TTTAGAAGCA GTTAAGTTAG TACTTTTTTA 4560 

CAAACATATG TATAATATAT TCGAGTATTT TTATTGAAAa tATTTTGGAA AACGACGAAT 4620 

CCAATAAGAA AATTTAAACA TGATTTGTAA GTTAGTTTAA TAGGAAATAT ATOCTAAACC 4680 

AAAAGAAGCA TATTGTTATT TACTGQAATA ATTAATAATC ATGTCATGTT AAATGTTAGC 4740 

ATATAATCAC GAGATAAAAT CTAAAATTTA AGATTAATCT TTTATGAATA AAAAACGTAT 4800 

CACAACAAAT AATAAAGTAA GGTGGTCAAG GTTATGAAAG TATTAGTAGC CATCGATGAG 4860

TTTCATGGAA TTATTTCAAG TTATCAAGCT AATAGATATG TTGAAGAGGC AGTTGCAAGC 4920 

CAAATTGAAA CTGCAGATGT AGTTCAAGTA CCATTGTTTA ATGGAAGACA TGAATTATTA 4980 

GATTCTGTAT TTTTATGGcm ATCTGGGcaA AAGTATCGTA TACCAGTACA TGATGCAGAT 5040

ATGAATGAAG TTGAAGGTGT TTACX3GACAA ACTGATACAG GGATGACCGT TATCGAGGGG 5100 

AATTTATTTT TAAAAGGTAA AAAACCAATT GTTGAACGAA CAAGTTATGG TTTAGGAGAA 5160 

ATGATTAAAC ATGCATTAGA TAACGACGCA AAACATGITG TAATTTCACT AGGTGGGATT 5220 

GATAGTTTTG ATGCTGGTGC AGGTATGTTA CAAGCATTAG GTGCTCAATT CTATGATGAC 5280 
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GATATGTCGA ACTTACACCC TAAAATGGAA ACAGCAAGAA TTCAAGTAAT GTCGGATTTT 5400 

TCAAGTCGAT TATATGGTAA GCAAAGTGAA ATCATGCAAA CTTATGATGC GCATCAGTTG 5460 

AATCATAATC AAGCAGCAGA AATCGATAAT TTAATTTGGT ATTTTAGTGA GTTATTTAAA 5520 

AGTGAATTGA AAATTGCAAT TGGTCCAGTT GAACGTGGTG GTGCTGGTGG TGGAATTGCA 5580 

GCAGTCTTGA ATGGACTGTA TCAAGCTGAA ATATTAACCA GTCATGCATT AGTAGACCAA 5640 

CTAACACATT TAGAAAATTT AGTTGAACT^ GCGGATTTAA TTATTTTTGG AGAAGGATTA 5700 

AATGAAAATG ATCAGTTGCT AGAAACX3ACA ACATTGCGTA TTGCAGAACT TTGTCATAAA 5760 

CATCAAAAGG TTGCCATTGC AATTTGTGCA ACTGCTGAAA AGTTTGATTT ATTTGAATCA 5820 

CAAGGGGTTA CAGCAATGTT TAATACATTT ATCGATATGC CAGAAACTTA TACTGACTTT 5880 

AAAATGGGtT ACAAATTAGG CATTATACGG TTCAGTCTTT AAAACTGTTG AAAACACATT 5940 

TTAATGTTGA GGTTTAGTAA AGAAGGACTA AATTGGTGAT GCTGTCATGA TGGTTAATAA 6000

CATTTATGAT GGTTAGCAAA ACGAATTAGA AGATCX3AAAG TATACGTAAA AAATATGAAA 6060 

AATCACGCTA TCATTGCACT GAATGTTAGC GTGATTTTTA TATATTAATT AAGCCTGAGT 6120 

TGAACTAGTA TATAATCGTT GGTTTTTAGT GATTTTCAGC 6ATATCTTCT ACAATTCCAA 6180

TGATTACTTG TACTGCTTTT TCCaTAACAT CAATGGATGC aTATTCATAT GGGCCGTGGA 6240 

AGTTACCGCA ACCTGTAAAG ATGTTTGGAG TTGGTAACCC CATAAATGAC AATTGTGAAC 6300

CATCTGTACC ACCGCGAATA GGTTCAGTGT TTGCTGGAAT ATCTAATTTG GCAAAGACAC 6360 

GTTTAGGTAT ATCAATAATA TGAGGCAATG GTAATATTTT TTCTGCCATA TTGAAATATT 6420 

GATCCGATAT ATCAACTTTA ACTGGATAAT TTTCAAAATG GGCATTGATA TCXSTCACGTA 6480 

TTTCTAAAAT ACGTTTCTTA CGCAATTCGA ATTGTTTTTT ATCATGATCA CGAATAATGT 6540 

ATTCSCAAAGT TGCTTTTTCA ACAGTTCCTT CAAAGTTCAT TAAGTGATAA AAGCCTTCGT 6600 

ATCCfTTCTGT TCGCTCCGGA ACTTCACTAT CAGGTAGCAA ACTATCGAAT TGTTCACCTA 6660 

AACGTATTGC GTTTACCATT GCATTTTTAG CTGAACCAGG ATGAACATTT ACACCXSTGGC 6720 

ATGTAATAAC CGCTTCAGCA GCGTTAAAGC TTTCATATTG TAATTCTCCA TATTGACTAC 6780 

CATCCATAGT ATAAGCAAAA TCAGCATTGA AGCGGTCAAC ATCAAATTTA TGTGGACCAC 6840

GACCGATTTC TTCX5TCTGGT GTAAATCCAA TGCGAATGGT ACCATGTTTA ATTTCTGGAT 6900 

GTTCTTGTAA ATAACAAATA GCTTCCATAA TTTCCACAAT ACCCGCTTTA TCGTCTGCAC 6960 

CTAGTAACGA TGTACCATCA GTTACCATTA ATGTATGACC AACTAAACTG TTAAGTTCTG 7020

GAAATACTTT AGGATCTAAG ACACX3TTTAG TATTGCCTAG TTTGTATGGC TTACCATCAT 7080 
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GCX5CCAAAAA TCCAACTGTT GGGACGTCGA CATCGATGTT ACTTTCTAAT GTAGCAAATA 7200 

AGTAGCCATT TTCATCTAAA TCA6TTGGCA ATCCTAATTG TTGTAATTCT TTTTCTAATA 7260 

AATGTAACAA ATCCCATTGC TTTTCAGTTG AAGGTGTTGT TGTAGATTTT GGATCAGATT 7320 

GCGTATCAAT TGTCGTATAT CTTGTTAATC TATCTATCAA TTGGTTCTTC ATTATATTCG 7380 

ACCCCTTAAA CTCTATTATT CATGTTGTAA GATTTTTTAT ATGTCTTACC TTTGATTTTA 7440 

CCATACAGTT GTTTGATACG TGTGTATAGG TAATATAGAA TTTCAGAAAC TAATATACCG 7500 

AAAGCAATCG CACCTGAAAT CAGTGTAcTT CTAAAAATGT ATTTACAGCA CTTGTATAAT 7560 

CATTTGATAC TAAAAAAOGA GTCGCTT6AT AAGCTGCACC ACCAGGTACT AATGGTATAA 7620 

TGCCTG6CAC TATGAATATA ATTACCGGTC GTTTATATCT GCGACTCATA GTATGACTCA 7680 

TTAAGCCTAA AATTAAGCTT CCCAAAAATG AAGCX3CCAAC TTTTCCAAAC TCTAAATCTA 7740 

CCGTTAATTG GTAAATCGTC CATGCAATQG CACCCACAAA TCCACATGCT ACTAAGAGGC 7800 

GTTTGGGTGC ATTGAAAATG ATAGAGAAAA GTACTGTTGA TATAAAGCTG ATTGTAAAAT 7860 

GAAATAAATA AAATAGCATG CTTTAACAGT CCTTCCTTAA ATGATTAATA AAACGATTGC 7920 

GACACCAGCA CCGATTGCGA ATGCTGTTAA TGCAGCTTCA ACACCGCGAG ACATACCTGC 7980

AAGTAATTCA CCCGCTAATA AATCTCGAAT GGCATTGGTA ATTAATATAC CAGGGACAAG 8040 

TGGCATGACA CTGGCTATAG TAATGATATC TTGATTGGTT GCAATGCCTA ATTTAGTAAA 8100 

TGTGGCTGCA ATGGATATGA CCACAGCGGC TGCAACAAAC TCTGAGAAAA ATTTAATTTG 8160 

TATATAGCGT tGCACAAAGC TGAATGTTAA AAATGCGGAT CCGCCAGCAA TGACTGCAAT 8220 

CCAACAATCT GATGCGACAC CACCAAACAT AAATAGGAAG AAGCCACATG CAATGGCAGC 8280 

TGCAAAGAAA TTCGTTAAAA AAGAATATTG TAATGAT6CA TGCTGTAAAT GAATAAATTC 8340 

AGATTTAGCT TCATCAATTG TGAGTTCTTT ATTTGATATT TTACGTGAAA GACTATTCGT 8400 

TAAAGCGATT TTCTCTAAAT CTGTTGTACX5 CTCTTGTACA OGAATTAATC TTGTACTTGT 8460 

TCGATCGTTT AATGAAAAAA TAATTGCACT TGAACTGACA AAACTATATG TATTATGAAG 8520 

ACCATAACTA TGTGCGATAC GGTTCATTGT ATCTTCAACT CGATATGTTT CAGCACCTGA 8580 

TTCaAGTAAA ATTCTACCTG CAATTAATAC JVACATCAATC ACTTTGTTTT CATCTATAAT 8640

TGTGATTGAA TCTGGCATAT CAATTCACCT CCAATGATAT GTGTTATTTA TTTGAACAAT 8700 

TGaAGTTTAC AACTTGTTGT TACAACTTTC AATAGTGAGA CTTTGTGTTA GTATGATGAA 8760 

CTTGTATGGT TCAAATTTAA ATAAGAAAAA CTGTTAATCT TTGCTATTAT ACTATGATTT 8820

AATAATAGCA AAGGATTAAC AGTTTTGTCG TTGTTATAAA TTGATAATAG GGTTAAACAT 8880 
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TTTACGCTGT 


GATTTTGGAT 


CGTCATCTGT TAAATAACCA 


ACACCGATAG ACACTGACAA 


9000 


TTTAATAACT TCTTTGTTTG GTAAATGGAA TGATGATTTT TCAACACCCG AACGAATATT 9060

TTCAGCTAAT TTAACACTTT GATCAAGTGA ATAATTGTGA ATGACAACTG AGAACTCTTC 9120

GCCACCATTT CTAAAAATTT TAAATTGATT CGGCACATAG TTTTTAAGTA ATTGAGACAT 9180

TTGTTTTAAT ACAGCATCAC CTGATTTGTG TGAGTAGGTA TCATTGaCAT CTTTAAATCC 9240

ATCGATATCG ATTAATAATA ATGCGATACT TTGATGTTCT TTTTCAGCTT TTCGTGAAAT 9300

TTCATTTAAA TGTCTATCAA ATTCTTTTAC ATTACCTAAG CCTGTTAAGT AATCATATTT 9360

ATCTTOGTTT TCATAACGAT TTACGAGTGA GAAGAAATGC CAAATATCGA CAAATGTTAT 9420

CJGCTGAAGCT AAAGTGATAA TTAATGAAAT TGGTATTAAA ATGATAACTT CCX3ATAGTGT 9480

GTAAATAGGA CTCACTAACG CGACACCAAA TAAAATGATT ATTGTAACAA CATTAAGTAT 9540

TAATAATGAT AGCACATCAT TTTGTTTTAA AAATGGTCCA ATAGCACTTG TTACTGCAGC 9600

AATAACAATC AACGTAACAC CGTACATAAT CGAGTTGTTA AATACTACAA TTTCAACAAT 9660

TGCTACAATT ACTGTGGCAG ATAATGTATA GACCATATTT GTAAATCTAC CTAAAAACAA 9720

TAAAGGAACG AATGTTAAGT GAATTAAATA ATCTTCACGA TAAGGGATAG GGTAGACAGA 9780

TAATAATAAT GATACGATTG TCATTAAAAC AGTGACATAA GCCTTAGAAA AAAC 9834



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23439 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
TCTCAATCAG ATGAAAAATT GCATATCGTA GGTTTTACAG AAAGTGCAAA ATATAATGCG 60 

TCATCAGTCA TTTTCACGAA TGACGCTACC ATTGCCAAGA TCAATCCTAG ATTGACTGGA 120 
GATAAAATTA ATGCAGTTGT TGTACGTGAT ACAAATTGGA AAGACAAAAA ATTAAACCAA 180 

GAGCTTGAAG CGGTAAGTAT TAATGACTTT ATTGAAAATT TACCAGGTTA TAAACCACAG 240
AACTTAACAT TAAACTTTAT GATTTCATTC TTATTTGTCA TTTCAGCTAC AGTTATAGGC 300 
ATTTTCCTAT ATGTCATGAC ATTACAAAAG ACGAGTTTAT TTGGCATATT AAAAGCTCAA 360 

GGATTTACGA ATGGCTATTT GGCGAATGTG GTAATTTOGC AGACGGTCAT ATTAGCACTA 420
TTTGGTACGG CATTTGGCTT ACTGTTAACA GGCGTTACAG GTGCATTTTT ACCTGATGCA 480 
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TCTGTATTAG GAAGTTTATT CTCCATTTTA ACAATTAGAA AAATAGATCC GTTAAAGGCG 600 

ATTGGGTAGG AGGTGTAGCA AATGTTGAAA TTTGAAAATG TAACAAAGTC ATTTAAAGAT 660 

GGGAATCGTA ACATTGAAGC GGTTAAAGAT ACAAATTTTG AGATAAATAA AGGTGATATT 720

ATAGCATTGG TTGGACCTTC TGGCTCTGGT AAAAGTACAT TTCTAACTAT GGCAGGTGCT 780 

TTACAAACAC CGACATCTGG GCACATTTTA ATCAATAACC AAGATATTAC GACAATGAAG 840 

CAAAAAGCAT TGGCAAAAGT TAGAATGTCT GAAATAGGTT TTATTTTACA AGCTACAAAC 900 

CTTGTACCAT TTTTAACX5GT AAAGCAACAA TTTACATTAT TGAAAAAGAA AAATAAGAAT 960 

GTTATGTCTA ATGAAGACTA TCAGCAACTT ATGTCACAAT TAGGTCTAAC TTCATTGCTT 1020 

AATAAGTTAC CTTCAGAAAT TTCAGGTGGT CAGAAACAAC GTGTGGCGAT AgCaAAGCGT 1080 

TATATACGAA TCCGTCGATT ATTTTAGCGG ATGAACCTAC CGCGGCX5TTA GATACTGAAA 1140 

ATGCGATTGA AGTCATTAAA ATTCTACGTG ATCAAGCCAA ACAAAGAAAG AAAGCATGTA 1200

TTATTGTTAC ACATGATGAA CX5ACTTAAAG CATATTGTGA TCX3TTCATAT CATATGAAAG 1260 

ATGGCGTCCT TAATCTTGAA AATGAAACAG TAGAATAGTT TTATTAAGCC GGTACATCAT 1320 

GTGCCGGTAT TTTTATGTTT ATGTATTATT TGAATAAACT TTCACATTCA ATTAATAATA 1380

ATTATTATCG AAAATCAGAA ATATTCCGTG AAATATAATA TTTTTTGTAG TAAAATGGCC 1440 

TCTAAGTATT CAATATTTAA ATATGGGGAT TGAATATAAA ATTATCGTAA TGGGGGTCAA 1500 

TGGTTATGGA TTTATTGATA GGTACTTTAT TTTTATTTTT GGTCTTAGTG ATTTTTACAT 1560 

TATTTACATA TAAAGCGCCT AATGGTATGC GTGCCATGGG AGCATTAGCT AATGCAGCAA 1620 

TCGCAACATT TTTAGTGGAA GCATTTAATA AATATGTTGG TGGCGAAGTA TTCGGTATTA 1680 

AATTTTTAGA AGAGCTAGGA GACGCTGCGG GAGGTCTAGG TGGTGTCX3CT GCCGCTGGAT 1740 

TAAC&GCATT AGCTATCGGT GTGTCACCAG TATATGCATT AGTTATAGCA GCCGCGTGCG 1800 

GTGGTATGGA TTTATTACCA GGTTTCTTTG CGGGTTATAT GATTGGATAT GTGATGAAAT 1860 

ATACAGAGAA ATATGTGCCG GATGGTGTCG ACTTAATTGG ATCGATTGTC ATCTTAGCGC 1920 

CATTAGCTCG TCTTATTGCA GTATTATTAA CGCCA6TAGT GAATAGTACA TTGATTCGAA 1980 

TTGGTGATAT TATCCAAAGT AGTACGAATA CGAATCCAAT TATCATGGGT ATCATTTTAG 2040

GTGGTATTAT TACGGTTGTC GGCACAGCGC CATTGAGTTC AATGGCATTG ACAGCATTAT 2100 

TAGGTTTAAC GGGTGTACCT ATGGCTATTG GTGCCATGGC AGCATTTAGT TCGGCATTTA 2160 

TGAATGGGAC 6CTATTCCAT CGCTTAAAAT TAGGTGATCG TAAGTCTACG ATTGCAGTAA 2220

GTATTGAACC TTTATCACAA GCAGATATTG TATCA6CCAA TCCAATTCCA ATCTATATTA 2280 
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ATGCGACAGG TACAGCTACA CCGATTGCAG GATTTTTAGT TATGTTTGGA TTTAATCATC 2400 

CGACGACAAT TGTGATTTAT GGTGTAGTAA TGGCGATTGT AGGTGCGCTT GCAGGTTATC 2460 

TTGGTTCAAT TGTATTTAAA AAATATCCAA TTGTTACTAA GCAAGACATG ATTAATCGAG 2520 

GTGCAGTAGA CGCATAGCAT CATCATATTG AATAGTAAAA ACAAATAAAA CATAGTAACG 2580 

TGATTCAGTC GATGTAACAG TCGATAATGA GTCACGTTTT TTTATAGAAA AATACAAGAC 2640

ATAAAAATGT CATAATTTAT TGTCGACAAA TATCATACTG TATAAACATT TATCATTTTC 2700 

TCAAGTACCT TTTACACGAT GGAATGAACT TACTTTTTAC GAAATTATGC GTATTTTATA 2760 

AACAAATATC ATTGATATAA CGGTAAATGT AAGCGTTTAC AACAGAAATA ACAGCATGCT 2820 

ACGATATTTT TGTAAATTCA CTGATTCAAG TATTTTAAGT CAATATGAGG AGGGATGTTA 2880 

TGAGCGATTC TGAGAAAGAA ATTTTAAAAA GAATTAAAGA TAATCC6TTT ATTTCACAAC 2940 

GTGAACTTGC TGAGGCAATT GGATTATCTA GACCCAGCGT AGCAAACATT ATTTCAGGAT 3000 

TAATACAAAA GGAATATGTT ATGGGAAAGG CATATGTTTT AAATGAAGAT TATCCTATTG 3060 

TTTGTATTGG CGCAGCGAAT GTAGATCX3TA AGTTTTATGT GCATAAAAAT TTAGTTGCAG 3120 

AAACATCAAA TCCTGTAACG TCAACACGCT CTATTGGTGG CGTAgCAAGA AATATTGCTG 3180

AGAACTTAGG TAGGCTTGGC GAAACGGTCG CTTTTTTATC TGCTAGTGGA CAAGATAGTG 3240 

AATGGGAAAT GATTAAACGA TTGTCCACAC CATTTATGAA TTTGGATCAT GTTCAACAAT 3300 

TTGAAAATGC GAGTACAGGT TCATATACAG CTTTAATTAG TAAAGAAGGC GACATGACAT 3360

ATGGCTTaGC AGATATGGAA GTGTTTGACT ACATTACGCC TGAATTTTTA ATTAAGCGTT 3420 

CACACTTATT GAAAAAGGCT AAGTGCATTA TTGTAGATTT GAATTTAGGC AAAGAGGCAT 3480

TAAACTTCTT ATGTGCCTAT ACCACGAAAC ATCAAATCAA ATTAGTTATC ACCACGGTTT 3540 

CTTCCCCAAA AATGAAAAAT ATGCCTGATT CATTACATGC TATTGATTGG ATTATCACGA 3600 

ATAAAGATGA AACAGAAACA TACTTAAATT TAAAAATAGA ATCTACTGAT GATTTAAAAA 3660 

TAGCTGCTAA ACGCTGGAAT GATTTAGGTG TTAAAAATGT TATTGTGACA AATGGCGTGA 3720 

AAGAACTCAT TTATCGAAGT GGTGAGGAAG AAATCATTAA GTCAGTTATG CCATCAAATA 3780 

GTGTGAAAGA TGTTACAGGT GCAGGCGATT CATTCTGTGC TGCAGTAGTG TATAGCTGGT 3840 

TAAATGGGAT GTCTACTGAA GATATATTAA TTGCTGGTAT GGTTAACGCA AAGAAAACGA 3900 

TAGAAACGAA ATATACAGTT AGGCAAAACC TAGATCAACA GCAACTTTAT CACX3ATATX3G 3960 

AGGATTATAA AAATGGCAAA TTTACAAAAG TATATTGAGT ATTCTCGAGA AGTTCAGCAA 4020 

GCACGGGAGA ACAATCAACC GATTGTAGCA TTAGAATCAA CAATTATTTC GCATGGTATG 4080 
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GCCATTCCAG CAACCATAGC CATTATAGAT GGCAAAATTA AAATTGGTTT AGAAAGCGAA 4200 

GATTTAGAAA TACTGGCAAC TAGTAAAGAC GTTGCTAAAG TATCTAGAAG GGATTTAGCA 4260 

GAAGTTATTG CGATGAAGTG TGTTGGTGCT ACTACTGTAG CGACGACGAT GATATGTGCT 4320 

GCAATGGCTG GTATTCAATT TTTTGTTACA GGAGGTATTG GGGGCGTCCA TAAAGGTGCA 4380 

GAACATACGA TGGACATTTC AGCAGACTTA GAAGAACTGT CTAAAACAAA TGTCACTGTT 4440 

ATCTGTGCAG GTGCCAAATC AATTTTAGAC TTACCTAAGA CGATGGAGTA TTTAGAAACA 4500 

AAAGGCGTTC CAGTTATTGG ATATCAAACG AATGAATTGC CAGCATTCTT CACTCGCGAA 4560 

AGCGGTGTTA AGTTAACAAG TTCGGrTGAA ACGCCAGAAC GACTTGCTGA CATTCATTTA 4620 

ACAAAACAGC AGTTAAATCT TQAAGGTGGC ATTGTTGTTG CTAATCCAAT TCCATATCAG 4680

CATGCCTTAT CAAAAGCATA TATTGAGGCA ATCATAAATG AAQCTGTTGT TGAAGCGGAA 4740 

AATCAAGGTA TTAAAGGTAA GGACGCCACA CCGTTCTTGT TAGGGAAAAT TGTAGAAAAA 4800

ACGAATGGTA AAAGTTTAGC AGCAAATATA AAACTTGTTG AAAACAATGC GGCGTTGGGT 4860

GCTAAAATTG CTGTCGCTGT TAATAAATTA TTGTAGGTGA TGATACATGA ATATTTTATT 4920 

CGCTATCACA GGGATAGCAT TTGCACTATT TGTTGOGTTT TTATTCAGTT TTGATCGTAA 4980

AAAAATAGAC TTCAAAAAGA CGTTAATAAT GATATTTATT CAAGTGTTGA TCGTGTTATT 5040

TATGATGAAC ACAACGATTG GTTTGACAAT TTTAACTGCA CTAGGTTCAT TTTTTGAAGG 5100 

GCTAATAAAT ATTAGTAAAG CAGGCATAAA TTTTGTTTTT GGAGATATAC AAAATAAAAA 5160 

TGGCTTTACG TTCTTTTTAA ACGTATTACT GCCATTAGTT TTTATTTCTG TATTAATAGG 5220 

CATCTTTAAT TATATTAAGG TATTACCATT TATTATCAAA TATGTAGGTA TCGCTATTAA 5280

TAAAATAACT AGAATGGGGC GCTTAGAAAG TTATTTTGCT ATTTCAACAG CAATGTTTGG 5340 

GCA^icAGAA GTATATTTAA CAATAAAAGA TATTATTCCA AGATTATCTA GAGCX5AAATT 5400 

ATATACAATT GCXSACGTCTG GTATGAGTGC TGTTAGTATG GCAATGCTAG GTTCATATAT 5460 

GCAGATGATT GAACCCAAGT TCGTAGTTAC AGCAGTAATG TTAAATATTT TTAGTGCGCT 5520 

TATCATCX5CC AGTGTAATCA ATCCCTATAA ATCTGATGAT ACTGATGTTG AAATTGATAA 5580 

CTTAACGAAA TCCACAGAAA CTAAAACATT GAATGGAAAA ACAGGAAAAC CTAAGAAAGT 5640

TGCCTTTTTC CAAATGATTG GTGATAGTGC GATGGATGGG TTTAAAATCG CTGTTGTAGT 5700 

AGCCGTAATG TTGTTAGCAT TTATTTCATT AATGGAAGCA ATTAATATCA TGTTTGGTAG 5760 

TGTTGGTTTG AACTTTAAAC AGCTTATTGG CTATGTGTTT GCACCAATCG CATTCTTAAT 5820

GGG6ATTCCA TGGAGCGAAC TGTTCCAGCT GGCTCTTTAA TGGCGACTAA ATTAATTACA 5880 
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CAAGGTATCA TTTCAGTTTA CTTAGTAAGc TTCGCTAATT TTGGTACGGT TGGTATCATC 6000 
GTAGGTTCAA TTAAAGGCAT TAGTGATAAA CAAGGAGAAA AAGTTGCATC CTTTGCAATG 6060 
AGGTTGCTAC TTGGTTCAAC TCTAGCTTCA ATCATTTCAG GATCAATCAT TGGCTTAGTA 6120 
TTGTAAATGA ATCGAAGTAC CTAAATTAAA TTCATGGCAA AGCTAAACCC CGTCACCAAG 6180 
TTGGCGCAAC AGCGcATgcA TAACTTAGTG ACGGGGTTTT ATCATAACAA TCTACTTTTT 6240 
CGTAGCC6TT TTTGAAATGT ATGTTGATGG TTTATCTTTT TCAAAAATTG TTAATCCCGT 6300 
TATATCTTTT TTATGTTTTG AAGGGACAAT GAAGCTAAGT ATATAAGCAA AGACAAAAGC 6360 
AACTGTAAAT GAAATGGTAG ATACATAGAA AGGTGAGTTA CCTTTGCCAA CACCATTATA 6420 

GACATAAGCA AAGATGATAC CCAATATTAA TCCACAAATA ACACCGAATG TATTCGTACG 6480 

TTTAGTGAAA ATACCAACTG CAAATACACC AGCCAATGGA ACGCCGAATA ATCCAGTCAC 6540 

AAACAAGAAT AAATCCCATA AGTCATTTGA ATTAGAAGCA ATTAAGTATA GTCACATTCC 6600

AAAACCGAAA ATACCTGCAA TGATAATAAT GAAACGTGCA AAGTTAACTT CGTGTCGCTC 6660 

GCTACCTTTT CCGAAGAAGC GTTGCTTAAT GTCGATTGAA ATACAAGCAG ATATAGAATT 6720 

TAAACTAGAT GAAATGGTAG ACTGTGCAGC GGCGAAAATG GCTGCAATAA GTAATCCTGC 6780 

TACAAATGGT GGCATCTCAG TCAAAATGAA ATATGGCACT ACAGATGATG TATTGAAGCC 6840 

TTTTGGTAAA ACAGCTTCAT GTGTATAAAA TGAATACAGC ATTGTACCCA TACCATAAAA 6900 

TAAGGGTGCT GAAATTAAAG CTAGGATACC ATTTGTCCAT AACGATTTAT TTGTTTCTTT 6960 

TAAACTATCA GAAGCTTGAT AACGCTGCAC GACGTCTTGA CTCGCTGTGT ATTGATACAA 7020 

GTTGTTGAAA ATATTTCCTA GGAAAATAAT TGGAATGGCA GCTGCCGCAG TATTTAGTTT 7080 

CCAATTGTCT GCACTAATTA ATTTTTTGTG CTCAATCGCA TCTGCAAAGA CAGTGCCGAA 7140 

ACCGtCTTTA ATGTTCACAA CACCTAGAAT AATAATAACT AAAGCGCCGC CTAATAAAAT 7200 

GACGCCTTGA ATGAAATCAC TCCAAACCAC ACCTTCGAAA CCACCTAAAA ATGTATATAA 7260 

AATACATAGT AAACCAACGA GTGATGCAAC GATATAAGGG TTCATGTCTG ATACAGATGT 7320 

GATTGCTAAT GTTGGTAAGT AGATAACAAT TGCAACACGC CCTAAATGGT AAACGACAAA 7380 

TAATAATGAG CCAAT6ACAC GTATGCTAGG GCCAAATCTA GCTTCTAAAT ATTCATATGC 7440

AGATGTTACC TTTAACTTTT TAAAGAAAGG GACATAGAAA TAAATAAGTA ATGGAATAAT 7500 

TGC6ACGATA GCAATGTTAC CAGCGATATA TGACCAATCT GTTAAAAATG CTTTCTCTGG 7560 

TGTCGACATA AATGTAATCG CACTTAACGT AGTAGCATAA ATTGAAAAGC CAACTACCCA 7620 

AGATGGCAAG CGACCACTTG CGGTAAAGAA ACTATTGGTA CTTTGGCTCG CGCGCTTGGT 7680 
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TGTGCCAAAT CCAACTTCTT TCATGGGCAA CATCCCCTTT ACAATGTATT GATTCTTTGA 7800 

TGTCTATAAA TCGTATTTTG CAATGAGTTG ATCTAATGTT TGTCGATGTG CTTCGTTAAA 7860 

AGGTTTGAAA GGTCTTTTCG GTAATCCTGC ATCAATGCCA CGATGACGTA ATATTTCTTT 7920 

CAATGTTGGA TAAATCCCCA TTGATAACAC TGTTTCGATA ATGTCGTTTG AATCATGTTG 7980 

CAGTTGGTAA GCTTCTTGAA TTTGACCTTG TCGTGCTAAG TCGAAGATTT TTCTTGCACG 8040

GCGACCATTA ACGTTATATG TAGAACCAAT TGCACCATCT ACGCCAGAAA TCGTAGCTTG 8100 

AACTAACATT TCATCAAAGC CAGATAAGAT TAATTTGTCT GGGAATGCTT TTCTAATACG 8160 

TTCGAGTAGG AAGAAGTTTG GCGCTGTATA TTTAACACCA ACAATTTTTT CATGATTAAA 8220 

TAGCTCGCTG AATTGTTCAA TAGAAATATT CACACCTGTT AAATCTGGTA TTGCATAAAT 8280 

AATCATATTG TTCTGAGTTG CTTOGATAAT ATCGAAATAG TAATCTCTAA TTTCTTCAAA 8340 

AGTAAATGGA TAGTAGAATG GTGTTACGGC AGAAAGTGCA TCATAACCX5A GTTCTGTGGC 8400 

ATATTTTCCA AGTTCAATGG CTTCATTTAA ATCTAAC6AA CCTACTT6AG CAATCAATTT 8460 

CACTTTATCC CCAACTGCCT CTTTGGCAAC CTTGAAAACT TGCTTCTTCT GCTCTGTATT 8520 

TAATAAAAAG TTTTCGCCTG AGCTACCATT TACATAAAGA CCGTCTAATT CTTCAGTTTC 8580

AATGGCATTT TGAGCAATTT GTTTAAGTCC TTGTTCATTT ACTTGACCAT TTTCATCAAA 8640 

AGGAACGAGT AACGCTGCAT ATAAACCTTT TAAATCTTTG TTCATTATGA AGTCCCTCCA 8700 

AAAATCATTT GATAATATAG TTTACAGCTA TAATTGTAAA CGCTATCATA AAATGTAACA 8760

ATATCTTTTT GAAAATTGTA GTCATATTTA TGTATAATTA ATGAAAATGT TTTTCAAAAT 8820 

CAATAGAAAT GGAGTGAGTA AGGTGTATTA CATCGCAATC GATATTGGAG GCACTCAAAT 8880

TAAATCGGCA GTTATTGATA AGCAATTGAA TATGTTTGAC TATCAACAAA TATCAACGCC 8940 

GGACAACAAA AGTGAGCTTA TTACTGACAA AGTATATGAG ATTGTAACAG GATATATGAA 9000 

GCAATATCAG TTGATCCAAC CTGTCATAGG TATTTCATCA GCAGGCGTTG TTGATGAACA 9060 

AAAAGGCGAA ATTGTATACG CAGGGCCAAC CATTCCGAAT TATAAAGGTA CTAATTTTAA 9120 

GCGATTATTA AAATCACTGT CTCCTTATGT CAAAGTAAAA AATGATGTAA ACGCTGCATT 9180 

ACTAGGCXa^ TTGAAATTAC ATCAATATCA AGCAGAACGG ATCTTTTGTA TGACGCTTGG 9240 

TACAGGCATT GGGGGTGCGT ACAAGAATAA TCAAGGTCAT ATTGATAATG GTGAGCTTCA 9300 

TAAGGCAAAT GAAGTTGGGT ATTTATTGTA TCGTCCAACT GAAAATACAA CGTTTGAGCA 9360 

ACGTGCTGCA ACGAGTGCAT TGAAAAAGCXi^ CATGATTGCC GGAGGATTTA CGAGAAGCAC 9420

ACATGTGCCA GTATTGTTTG AAGCAGCTGA AGAAGGTGAT QATATTGCAA AACAAATATT 9480 
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AGGGCTTATA TTAATTGGGG GCGGTATATC TGAACAAGGA GATAATCTCA TTAAATATAT 9600

CGAGCCGAAA GTTGCACACT ATTTACCAAA AGACTATGTT TATGCACCAA TACAAACGAC 9660

TAAGAGTAAA AATGATGCAG CATTATATGG CTGTTTGCAA TGATAGTTGA AAGAAGGAGT 9720

CATTCTAAAA TAGAATTTGA AACCGTTACG AGAGATGAGA GCTGTTGTTA GTTCCACACA 9780

TCACACTCTA TCTAGGACCA ATCTAAACTA TATCAACCAA CAGTGTGCCA CGGGCAAATT 9840

AAATTGAAGA AGCTGAGATA TTAAAATTTT AGAAAATGTA AAAAAATATT TGGTATTGAA 9900

ATTAAAAAAG CACCTAGCAA CTCGTTGGGA CAATCACGAT GATTGTCTAC AGTTGCAGGT 9960

GGATTTGAAT ATACTACTAG TTATTTGTTG TCTAGGATAA TA6ATTTAGT ATGTTGATAA 10020

GTTTGACTCA GATTCGTATT TTCTAATAAA TGATAACTCA CGATATCGAT TAAAAAGAGT 10080

GTCGCAATTT GTGTGTTGAT AAATTGATGG TCGGTATTAC OCGATTGATC CGTTGTTAAA 10140

AGTACTAAAT CTGCACAATC TGTAAGTTTA CTACCTTCAA AATTTGTGAT GGCAACGACA 10200

TATGCACCAT GAGATTTGGC GACTTCCGCT GCAGAAATTA ATTCCGAAGT ATTACCACTA 10260

TTTGACATAG CAATAAACAT ATCCGAATGA GATAGTAGGG ATGCCGATAT TTTCATTAAA 10320

TGTGAATCGG TAGTAACATT ACCTTTTAGC CCCATACGAA TCATACGATA ATAAAATTCA 10380

GTCGCTGATA AACCAGAGCT ACCTAGTCCA GCAAAGAGTA TATGTCGACT TGATTGAAGT 10440

TTGTCGATAA AGGTTTGGAT AATGTCGTTA TCAATAAATT CACCAGTTTG TTGAATGATT 10500

TGTTGATGAT ATTTATGAAT TCTTTGAATA ATTGGGCTAT TTTCAATAAC TGTCTCTGTC 10560

ATTTCTTGTT GAATATTAAA TTTTAAATCT TGGAAATTCT CATAATCCAG CTTATGACTA 10620

AAGCGTGTCA TCGTTGCTGG TGATGTACCA ATCGCATGGG CTAAGGAGTT AATCGTTGAA 10680

AAGGCATCGC TATAACCATT TTGTCTTATA taattgacx;a TGCX3TTTATC AGTTTTTGTA 10740

AATAAATGTT GATAACGTTG AACACGATTC tcaaatttca TTGTGTCACC CCTTCATCTT 10800

AATGATTACT ATTATATATG AAAAATATTT tcaagatagt AAAAAGCATT GATAAAAATT 10860

ATCTTAATGA TATATTGTAA ATGACTTTAC GTGAAAAAAC GACTTATGGA GTGAGGAATA 10920

ATGTTACCAC ATGGATTAAT AGTATCTTGT CAGGCACTAC CAGATGAACC ATTGCATTCA 10980

TCTTTTATTA TGTCGAAAAT GGCATTAGCT GCGTATGAAG GTGGTGCTOT TGGTATTCX3C 11040

GCAAATACTA AGGAAGACAT TTTAGCAATT AAAGAAACGG TAGATTTACC AGTTATTGGC 11100

ATTGTGAAAC GTGACTATGA TCACTCAGAT GTTTTCATTA CTGCAACGTC AAAAGAAGTT 11160

GATGAACTGA TAGAAAGCCA ATGTGAAGTC ATTGCATTGG ATGCAACGTT ACAGCAACGT 11220

CCGAAAGAAA CGTTAGACGA ATTAGTATCA TATATTAGAA CACATGCACC GAACGTTGAA 11280
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TATATTGGCA CGACGTTACA TGGCTATACT AGTTATACGC AAGGACAATT ACTTTATCAA 11400

AATGACTTCC AATTTTTAAA AGATGTACTA CAAAGTGTTG ATGCAAAAGT TATTGCGGAA 11460

GGTAATGTCA TTACACCGGA TATGTATAAA CGTGTGATGG ACTTAGGCGT TCATTGTTCA 11520

GTCGTTGGTG GTGCGATAAC ACGACCAAAA GAAATTACGA AACGTTTTGT TCAAATTATG 11580

GAAGATTAAA TGATAACGAT AAAAAAACGA GATGACCATC ATTAATTAAA GGCACCTAAT 11640

TATCTTAGGT GGCTGAATGA ATGTAATGGG TTCATCTCGT TTTGTTTGTT TATGATAGTG 11700

ATTTTATTTT CAACTTTATC CAAAAATAAG TAAAGCTGACG GGGATGGTGA TTAATAGCGA 11760

CAACGCCAC6 CX3TAAAAACC AAATGATGAT GAGTTTCCA5 ACAGGTATTT TAATTTCAGT 11820

TGCTAGTATA CATGGCACTA ATGCTGAGAA AAAGATAATG GCTGATACGC TTACTACACC 11880

GACX5ACAAAT TTAGTACTCA TTGCAGCTTT AGTTACTAAC AAAGATGQTA GAAACATCTC 11940

TACAATAGAA AckCTGACGC TTTTGCTAGT AAAGCCTGAT CAGCAATTGG GAAAATATAA 12000

ATAAATGGAT AGAAGATATA GCCAAGCCAA TCAATGAATG 6T6TATAGTT CGCTACAATC 12060

AGTCCTAAAA AACCAATCGA TAATATAGAA GGTAAAATAC CAACAGTCAT TTCTAAACCG 12120

TCTTTCAAAT TGTCCCAAAC GTTCTTCACG AGAGATGGTG TTAATGCATT TTGTTTCATC 12180

GCCTCTGCAT ATGCAGTTTT CAGTCTGCTT CCTTCAATAG CAACTTCTTG TTCTCCTTCT 12240

TGTCCGTTAT AATATTCTGT TGATTCATTG CTGATTGGCG GTAGCCATGC AGTAATTGCA 12300

GTCACGACAA ATGTGATGAC TAAAGTTATC CAAAAGTATA AATTCCAATG CGGCATTAAT 12360

CCTAAAGTTT TAGCAACGAT AATCATAAAA GTTGCTGAAA CTGTTGAAAA GCCAGTCGCA 12420

ATAATCGTGG CTTCTCGTTT GTTGTACATC CCTTGCTTAT AGACACGATT AGTAATCAAT 12480

AATCCTAAGG AATAACTGCC GACAAACGAA GCCACTGCAT CGACAGCGGA TTTTCCTGGT 12540

GirrTAAAAA TAGGTCTCAT AATAGGCTCC ATATAAACAC CGACAAATTC TAATAAGCCA 12600

TAGCCCACTA ATAAAGAAAG CGcAATTGCA CCTACTGGAA TTAAGATACT TAATGGCATC 12660

ATTAATTTTT CAAACAAAAA CGGACCATAG TTAGCTTTAA ATAGTATTGA TGGACCGATT 12720

TTAAATACAT ACATTATACC GATCATTGCA CCTGCAACTT TAAATAATGT AATGACCAAG 12780

TTTGTGATTG AAGTCATAAA AGTACGTCTC ACTATTGGTA ACGCTGTACC AATTAAAATC 12840

ATAATCAGTG CAACATAGGG CATAAGTGGA CCTATGATTG AGCX5AATGGC TAGATGAACA 12900

TGATCGACGA AAATAGTGTT GTTACCATTA ATCXSTAAAAG GAATAAAGAA ACATAGTATG 12960

CCCACTAAAC TATAGACAAA AAAACGCCAT GCACTTGGTT GTTGTGCATT AGAATGATAT 13020

TGATTCATTA AAGCAACCCC TTTGTTTAAA TGAATACACA AAACTGTATG ATGCATCTTC 13080
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ATAGTTTGAA TTATTTTCAT ACCAATACAA ATTAACTAAT TATATATAGA TTGAAACTAT 13200

ATTACTTAAT AAAATATTTA TCTTAAATGT TGTTGTGTTG ATTCAACACC ACAACTAAAA 13260

GTGTTTATAA ATTATTTGGA AATACACATA TTTGTAAATG ATTAGTATCG ATTTAATATC 13320

GTATTATTAA ATTTTTATTA ATTTTGTAGT CTTAATCmAA AAATAATATA TGTCATGTTA 13380

TATTGAAGGT GCAGTTGTTT TTCATTCTCA AGAGGGGGTC AAAAAAATAC TTTTGAGGTG 13440

ATTATATGTT AAGAGGACAA GAAGAAAGAA AGTATAGTAT TAGAAAGTAT TCAATAGGCG 13500

TGGTGTCAGT GTTAGCGGCT ACAATGTTTG TTGTGTCATC ACATGAAGCA CAAGCCTCGG 13560

AAAAAACATC AACTAATGCA 6CG0CACAAA AAGAAACACT AAATCAACCX3 GGAGAACAAG 13620

GGAATGCXSAT AACGTCACAT CAAATGCAGT CAGGAAAGCA ATTAGACX3AT ATGCATAAAG 13680

AGAATGGTAA AAGTGGAACA GTGACAGAAG GTAAAGATAC GCTTCAATCA TCGAAGCATC 13740

AATCAACACA AAATAGTAAA ACAATCAGAA CGCAAAATGA TAATCAAGTA AAGCAAGATT 13800

CTGAACGACA AGGTTCTAAA CAGTCACACC AAAATAATGC GACTAATAAT ACTGAACGTC 13860

AAAATGATCA GGTTCAAAAT ACCCATCATG CTGAACXiTAA TGGATCACAA TCGACAACGT 13920

CACAATCGAA TGATGTTGAT AAATCACAAC CATCCATTCC GGCACAAAAG GTAATACCCA 13980

ATCATGATAA AGCAGCACCA ACTTCAACTA CACCCCOGTC TAATGATAAA ACTGCACCTA 14040

AATCAACAAA AGCACAAGAT GCAACCACGG ACAAACATCC AAATCAACAA GATACACATC 14100

AACCTGOGCA TCAAATCATA GATGCAAAGC AAGATGATAC TGTTCGCCAA AGTGAACAGA 14160

AACCACAAGT TGGCGATTTA AGTAAACATA TCGATGGTCA AAATTCCCCA GAGAAACCGA 14220

CAGATAAAAA TACTGATaAT AAACAACTAA TCAAAGATGC GCTTCAAGCG CCTAAAACAC 14280

GTTCGACTAC AAATGCAGCA GCAGATGCTA AAAAGGTTCG ACCACTTAAA GCGAATCAAG 14340

TACAACCACT TAACAAATAT CCAGTTGTTT TTGTACATGG ATTTTTAGGA TTAGTAGGCG 14400

ATAATGCACC TGCTTTATAT CCAAATTATT GGGGTGGAAA TAAATTTAAA GTTATC6AAG 14460

AATTGAGAAA GCAAGGCTAT AATGTACATC AAGCAAGTGT AAGTGCATTT GGTAGTAACT 14520

ATGATCGCGC TGTAGAACTT TATTATTACA TTAAAGGTGG TCGCGTAGAT TATGGCGCAG 14580

CACATGCAGC TAAATACGGA CATGAGCGCT ATGGTAAGAC TTATAAAGGA ATCATGCCTA 14640

ATTGGGAACC TGGTAAAAAG GTACATCTTG TAGGGCATA6 TATGGGTG6T CAAACAATTC 14700

GTTTAATGGA AGAGTTTTTA AGAAATGGTA ACAAAGAAGA AATTGCCTAT CATAAAGCGC 14760

ATGGTGGAGA AATATCACCA TTATTCACTG GTGGTCATAA CAATATGGTT GCATCAATCA 14820

CAACATTAGC AACACCACAT AATGGTTCAC AAGCAGCTGA TAAGTTTGGA AATACAGAAG 14880



ATTTAGGATT AACGCAATGG GGCTTTAAAC AATTACCAAA TGAGAGTTAC ATTGACTATA 15000 

TAAAACGCGT TAGTAAAAGC AAAATTTGGA CATCAGACGA CAATGCTGCC TATGATTTAA 15060 

CGTTAGATGG CTCTGCAAAA TTGAACAACA TGACAAGTAT GAATCCTAAT ATTACGTATA 15120 

CGACTTATAC AGGTGTATCA TCTCATACTG GTCCATTAGG TTATGAAAAT CCTGATTTAG 15180 

GTACATTTTT CTTAATGGCT ACAACGAGTA GAATTATTGG TCATGATGCA AGAGAAGAAT 15240 

GGCGTAAAAA TGATGGTGTC GTACCAGTGA TTTCGTCATT ACATCCGTCC AATCAACCAT 15300 

TTGTTAATGT TACGAATGAT GAACCTGCCA CACGCAGAGG TATCTGGCAA GTTAAACCAA 15360 

TCATACaAGG ATGGGATCAT GTCGATTTTA TCGGTGTGGA CTTCCTGGAT TTCAAACGTA 15420 

AAGGTGCAGA ACTTGCCAAC TTCTATACAG GTATTATAAA TGACTTGTTG CGTGTTGAAG 15480 

CGACTGAAAG TAAAGGAACA CAATTGAAAG CAAGTTAAAT TCATCTTCTG AATTTAATAT 15540 

GCTATGTAAA TCGTGCTGTT ATCATGGCAC ATCAGATATA AGTAGCATCA CAGTGTTGAA 15600

TTTAAAAATA GTAAAGTGAA ATAAAGCGCC TGTCTCATTA GCGAAAACTA AAGGGACAGG 15660 

CGTATCTGTT TATGAGCTTA ATAAATTGTA TGAATAATAT GGTTGATCGA ATAACTGTTT 15720 

ATCATGATGA TAAATTGAGT TTTTTAAAAT AATGATATAT TACATCATTG TTATAGOGTT 15780

TAAGAAATCA ACAACTTTAC GATAAATAGT GATTGCTTCG TCATTAGGTC TACX3ATCAAA 15840 

ATCATGCTCG TTTTTATTCA CGCGTTCAAA TGTTGAATGT GGAACATGAT TCATGATATG 15900 

TTCGCTTTCC TCAACGGGAA CATCATAATC GCCATTACAA TGCGCAATGA AAACAGGTGG 15960 

AAGTGTTTTA AGTTCATCTG GTGCAATATT ATATTTTGAA TTAGTATAAT CAGCAATGTT 16020 

AATCATATTT ATCCATTTAC CTGTGCCACG TGCATAAACG TAGATTAAAA AACGTTGTGC 16080 

GATTTGATCT TGAACAACCG GTGTTGGTGA AGTGAGTTGT GCAATCATTG TTTCGTTTAC 16140 

GCTTTGAGCT ATTTTTGCX3T AATAACTATT AGTTGTTTTA AAAGGTTCAG TGTTGATGCG 16200 

ACTATAACCA TAAAAATCAA TAACACCATC AATATCTCTG TCTCGTGCAA TTAATAGACT 16260 

TAAATATGCA CCTGATGATC TGCCAAAGGT AAAAATAGGG CAATTAGAAT ATTGTGATTG 16320 

AATCX3CATCG AATGAtGCgn AGnACATCCT CAATAATGCA ATCGAGACTT ACTTCTGGTA 16380 

ATAAACXIATA ACTTA6TTGA ATTAAATCGT AATGTTCCGT AAgATATCGA TATACTGTGG 16440

GGATAAATCG TTAGCTTTAC CGAACATTAA TCCACCACCG TGGATGTAGA CAATAGCGCC 16500 

TTTTGTTGGT TGATTTTTTG CTTTAATAAT TGTGTAAGGT AATGCAAATG CATCTTTAGT 16560 

AATTACTTTA TCTTTAATTT CAGTCACGAT TTAATAGGCT CCTTATTTTT GATATTGATG 16620

TCATTATAAC ACTGTCTTAA ATTTCCATGA AAAATAGTCT TAAGACGATG AGTCATGATA 16680 
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CATCATTTTA ACAATATCTT TAAAAGCAGC ATGTGGAATG GCTAAATCTT CTAAATCTGC 16800 

CATAGAAAAT TCAAGATTGA TATCATGTGG TCGCTGTTCA GCAAGTTTAT GCACAAAGTC 16860 

AGGTTCTGTG ACAAAAGGCG AAGACATGCC GACCATATCT GCATGTTGTA AAGCATCTAA 16920 

AGCAGACTCT GGAGAATTAA TCCCGCCACT TGCAATTAAA GGGATACGAC CTGCTAAATG 16980 

TTCATAGACA ATTTGGTTAA CTGGTCGACC GAAATGATCA CCTGGTGTAC GAGACGTATT 17040 

TTGATAAATA TGTCGACCCC AGCTAGCGAT TGCTAAGTAT TGGATGTTTG AAACOTCCAT 17100 

GACCCAATTG ATTAATTGGT TGAACTCGTC AATGGTATAT CCTAAATCAC TGCCTCTGGT 17160 

TTCTTCTGGC GTTGCTCGAA ATCCTAAAAT AAAATTGTCA GGTGCTTCTT TATCAATCAC 17220 

TTCTTGTACC GCACGCATAA CTTCTAAACA TAATCTTGCA CGATTTTTTA ATGAGTCGGC 17280 

ACC6TAATGG TCTGTACGTT TATTCGAAAA AGTTGAGAAA AATGTTTGAA TCAGCAAACG 17340 

TTGTGCAATC GAAATTTCCA CACCATCAAA ACCTGCTTTA ATCGCGCGTA ATGTAGCATC 17400 

GCGATACTGC TGAATGATGC TATTGATTTT CTCATGAGAC ATGGCGATAA CATCGTGTTC 17460 

AATCGGTGAA TGCAATGTCA TAGGGCTTGG TCCATACACC TTTCCAAAAT TTAAAATGGC 17520 

TTGATTTGAA AAACGACCAG CATGCGCTAg CTGGATAATA GCGAGGCTAC CATGTTGTTT 17580 

CATCGTAGAT GCCATGTTAG TTAATCCAGG GATACAAGCA TCATGATCAA TATTAAAGCC 17640 

ATATTCAAAC AATTGACCAT AAGGTTCAAT GTAAGCAGCG CCGGTGACTT GCATTCCAGC 17700 

TGAATTAGAG CGACGTGCAG CATAAGCCAA GTCTTCTTTT GTAATATAGC CTTCTTTTGT 17760 

TGATGTGTTT ACGGTCATTG GTGATAATAC AAAGCGATTC GAAATTTTGA TGCCATTAGG 17820 

TAAGTGGATT GATTGTAAAA GTGGTTTGTA TCX3GTACATA CTATGATTCC TTTTCTATTC 17880 

AATATTGTTT TCAAAGTACC ATGGAAAGAA TGAATAATCA ATGATGAACA GTCTTGATAG 17940 

AATAGAATT6 GTACATGGAA AGTATTTTTA AAATTAAACT AATGAATGGC ATTTGTAGGT 18000 

CTGAAAATAT GAATATGAAA AAGAAAAATA AAGGC6AAAA GATATAAAAG TTAATTGAAA 18060 

AACGTTATCA TATACGTGGG TATATGAAGA GGGAATGGTA TTAAGAACGC TAAAATGTTA 18120 

TGTCGGTTTG ACATGACAGG ATAAGTTTGG AGATGACGGA TTGGTTAAAT TAAGCGTATT 18180 

AGACTATGCC TTAATAGATG AAGGTAAGGA TGCACAAAAG GCATTGCAAG ATTCAGTOAC 18240 

ACTTGCAAAA TTAGCAGATC GACTTGGCTT TAAGCGAATT TGGTTTACGG AACATCATAA 18300 

TGTACCAGCG TTTGCGTGTA GTAGTCCAGA ACTTTTGATO ATGCATACAT TGGCGCAGAC 18360 

AAATCACATA CGAGTTGGCT CTGGTGGTGT GATGCTGCCG CACTATCQAC CTTATAAAAT 18420 

TGCTGAGCAT TTTAGAATGA TGGCAGCGTT ATATCCAAAT CGTATTGATT TAGGTATTGG 18480 
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TAGTTACGAT GAATCGATTT <X3TTATTACG TGATTATCTT ACAATAAAGG ATAAACCAAG 18600 

TGCGCATACG TTAGGTGTCC AACCACACAT TGATCATTTT CCAGAAATGT GGTTATTAAG 18660 

TAGTAGCGCA ACATCTGCXA AAATAGCTGC CGAACTAGGT ATAGGGCTTT CTGTTGGAAC 18720 

ATTTTTGCTA CCAGATATAA ATGCGATACA TACAGCGAAG GATAACATTG ATATTTACAA 18780 

AAAACATTTC CAAGCATCAA CGATTAAAAT GGACX3CAAAG GTGATGGCAT CTGTATTTGT 18840 

CATTGTAGCT GATAACGAAG CGGAAGTAGC AGCATTACAA CATGCCTTAG ATGTTTGGTT 18900 

ATTAGGTAAA TTACAATTTG CAGAATTTGA AGATTTTCCT TCAGTAGACA CAGCACAAAA 18960 

GTATAAGCTT AATGATCGAG ACAAAGAGAT GATTCAAGCA CATCAAGCAC GCATCATTGC 19020 

AGQTACACAA GAAAAGGTTA AAGCACAATT AGATGATTTC ATTGCTACGT TTGAAGTTGA 19080 

TGAGGTGTTA GTAGCACCGC TTATTCCAGG TATTGAACAG CGTTGTAAAA CATTAAAATT 19140 

ACTCQCGGAA ATTTATTTGT AGCATTTTAA ATAGAAGAGA AAGGATGAAG ATAAGATGAA 19200 

AAAGTTAGCC AATTATTTAT GGGTAGAAAA AGTAGGAGAT TTGTATGTGT TTAGTATGAC 19260 

ACCTGAATTG CAAGATGATA TTGGGACAGT AGGTTATGTT GAATTCGTAA GTCCAGATGA 19320 

AGTTAAAGTG GATGATGAAA TTGTGAGTAT CGAAGCATCG AAAAOGGTCA TTGATGTGCA 19380 

AACGCCATTG TCAGGAAC6A TTATTGAGCG AAATACAAAA GCGGAAGAAG AACCGACAAT 19440 

TTTAAACTCT GAAAAACCA6 AAGAAAATTG GTTGTTCAAA TTGGATGATG TCGATAAAGA 19500 

AGCATTCCTA GCATTACCGG AGGCTTAAAT GGAAACGTTA AAATCAAATA AAGCGAGACT 19560 

TGAATATTTA ATCAATGATA TGCATCGAGA GAGAAATGAC AATGACGTAT TGGTAATGCC 19620 

ATCTTCATTT GAAGATTTGT GGGAATTATA TCGAGGCTTA GCAAATGTCA GACCGGCATT 19680 

ACCTGTAAGT GATGAATATT TAGCTGTACA AGATGCTATG TTAAGTGATT TGAATCGTCA 19740 

ACATSTTACG GATTTGAAGG ATTTGAAGCC GATAAAAGGT GACAATATCT TTGTTTGGCA 19800 

AGGTGATATC ACGACGTTAA AAATCQATGC TATTGTTAAT GCTGCAAATA GTCGTTTTCT 19860 

AGGATGTATG CAAGCTAATC ATGACTGCAT TGATAATATT ATTCATACAA AAGCGGGTGT 19920 

TCAAGTTCGA CTTGATTGTG CAGAGATCAT TCGACAACAA GGGCGCAATG AAGGTGTAGG 19980 

TAAAGCCAAA ATAACACGTG GATATAATTT GCCAGCAAAG TATATAATTC ATACGGTTGG 20040 

TCCGCAAATA CGTCGATTGC CTGTTTCAAA GATGAATCAG GACTTGTTAG CTAAATGTTA 20100 

TCrrAGCTGT CTTAAATTGG CTGATCAACA TAGTTTAAAT CATGTCGCTT TTTGCTGTAT 20160 

ATCTACAGGT GTATTTGCTT TTCCTCAAGA TGAAGCAGCA GAAATTGCTG TTCGAACAGT 20220 

AGAAAGCTAT CTCAAAGAAA CAAATTCAAC ATTGAAAGTC GTGTTCAATG TATTTACAGA 20280 
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CAATGTCTCT GTTAATGGAT GACAAGACAA AGCAGGCTGA AGTATTGCGT ACTGCGATTG 20400 

ATGAAGCAGA TGCGATAGTG ATTGGAATTG GTGCAGGCAT GTCTGCATCT GACGGATTTA 20460 

CATATGTAGG AGAGCGTTTT ACGGAAAATT TCCCAGATTT TATTGAAAAA TATCGCTTCT 20520 

TTGATATGTT GCAAGCGAGT TTACATCCTT ATGGCAGTTG GCAAGAGTAT TGGGCATTTG 20580

AGAGTCGTTT TATTACATTA AACTATTTAG ATCAACCTGT AGGTCAGTCT TACCTCGCTT 20640 

TAAAATCCTT GGTGGAAGGT AAACAGTACC ACATTATAAC TACGAATGCA GATAATGCTT 20700 

TCGATGTAGC TGATTATGAT ATGACTCATG TATTTCATAT ACAAGGGGAG TATATACTGC 20760 

AACAGTGTAG cTCAGCATTG TCATGCTCAA ACGTATCGCA ATGATGATTT AATTOGTAAA 20820 

ATGGTTGTTG CGCAACAAGA TATGCTTATA CCTTGGGAGA TGATTCCAAG ATGTCCAAAA 20880

TGTGATGCCC CAATGGAAGT GAATAAACGT AAAGCGGAAG TTGGGAT6GT TGAAGATGCT 20940 

GAATTTCATG CGCAACTACA TCGTTATAAT GCTTTTCTAG AGCAACTVTCA AGATGATAAA 21000 

GTGTTGTATT TGGAAATTGG AATTGGTTAT ACTACACCAC AATTTGTGAA GCATCCTTTT 21060 

CAGCGTATGA CACGTAAAAA TGAAAATGCC CTTTATATGA CGATGAATAA AAAGGCATAT 21120 

CGCATTCCGA ATTCAATTCA AGAACGTACC ATACATTTAA CTGAGGATAT CTCAACATTG 21180 

ATTACAGCAG CACTCCGGAA CGACAGCACA ACGAAAAATA ACAACATTGG AGAGACAGAA 21240 

GATGTACTTA ATAGAACCGA TTAGAAATGG AGAATATATT ACTGATGGTG CGATTGCACT 21300 

CGCTATGCAA GTTTATGTTA ACCAGCATAT CTTTTTAGAT GAAGATATTT TATTCCCTTA 21360 

TTATTGTGAT CCAAAAGTGG AAATTGGACG TTTTCAAAAT ACTGCTATAG AAGTGAATCA 21420 

AGATTATATA GATAAACACA GTATTCAAGT AGTTCGCCGA GATACTGGTG GTGGCGCTGT 21480

GTATGTTGAT AAAGGTGCCG TTAATATGTG TTGTATTTTA GAACAAGACA CTTCAATTTA 21540 

TGGISATTTT CAACGATTTT ATCAACCAGC TATAAAGGCG TTGCATACAT TAGGTGCAAC 21600 

AGATGTGGTA CAAAGCGGTA GAAATGATTT AACATTGAAT GGTAAAAAAG TGTCAGGCGC 21660 

C6CAATGACA TTAATGAATA ATCGTATTTA TGGCGGTTAT TCGCTATTAC TTGATGTTAA 21720 

TTATGAAGCA ATGGATAAAG TGTTAAAGCC TAATCGCAAA AAGATTGCAT C6AAAGGGAT 21780 

TAAATCTGTG CGCGCACGTG TTGGTCATCT TAGAGAAGCA CTGGATGAAA AGTATCGTGA 21840 

TATAACCATT GAAGAATTTA AAAATTTAAT GGTGACGCAG ATTTTGGGAA TCGATGACAT 21900 

TAAAGAGGCG AAACGATATG AATTAACGGA TGCAGATTGG GAAGCGATTG ATGAATTAGC 21960 

TGATAAAAAG TATAAAAATT GGGATTGGAA TTATGGCAAG TCACCCAAAT ATGAATACAA 22020 

TCGAAGTGAA AGATTATCTT CAGGTACGGT AGACATAACA ATTTCTGTTG AACAAAATCG 22080 
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AGAAGCATTA CAAGGAACAA AAATGACAAG AGAAGATTTA ACGCATCAGT TAAAGCAATT 22200 

AGACATCGTT TATTATTTTG GCAATGTTAC GGTAGAAGCA TTAGTGGATA TGATTTTAAG 22260 

TTAATATTGT TATTTTATGT ATGCTGAATC ATTGGAAGTG TTTGCTTGCT CTTGAAAAGG 22320 

TGACAATAGT GTTTGGTGAA GGTTGAACAT ATGAGTGGAA ATTATTGCCT TTAACTATTC 22380 

AAAGTATGAT ATATATATGG TTTTTGTTTC TAAATGATTG GGTATTTGAA AATAGATGAG 22440 

TTTAATATTT TAAGGAATAT AATGATGTTT ACTTTTATAA TTCATATAGA ATATTAAGCA 22500 

ATATAAGTCT GTTGATATAT ACAAAATATA ATGACTGCTA TAATGAGTAA TCAATAGACA 22560 

CAAAGAGGAG ATTATGTGAT GAATAATAAA GTATTAGTAA CCGGTGGTAC AGGGTTTGTT 22620 

GGCATGCGAA TTATTTCACG ATTATTAGAA CAAGGTTATG ACGTACAAAC GACGATACX5T 22680 

GATTTAAGTA AAGCTGATAA AGTAATTAAA ACAATGCAAG ACAATGGCAT TTCCACAGAG 22740 

CQATTAATGT TT6TCGAAGC GGATTTATCA CAAGATGAAC ATTGGGATGA AGCAATGAAA 22800

GATTGCAAGT ATGTCTTGAG TGTAGCATCT CCGGTGTTTT TCX3GTAAAAC AGACGATGCA 22860 

GAAGTGATGG CGAaCTGcAA TTGAA06TAT ACAACGTATT TTAAGAGCTG CAGAACATGC 22920 

GGGTGTTAAA CGTGTGGTAA TGACTGCAAA CTTTGGTGCA GTTGGTTTTA GTAATAAAGA 22980

TAAAAATTCA ATCACAAATG AAAGTCATTG GACAAATGAA GATGAACCAG GCTTATCAGT 23040 

ATATGAAAAA TCAAAATTGT TAGCTQAAAA GGCAGCGTGG GATTTTGTTG AGAATGAAAA 23100 

TACAACAGTA GAATTTGCCA CAATCAATCC AGTTGCAATT TTTGGGCCAT CATTAGATGC 23160 

ACACGTTTCA GGAAGCTTTC ATTTATTAGA AAATTTATTG AATGGTTCAA TGAAACGTGT 23220 

ACCGCAAATT CCGTTAAATG TTGTTGATGT GAGAGACGTA GCTGAACTGC ACATTTTGGC 23280 

AATGACAAAT GAACAAGCTA ATGGCAAGCG ATTTATTGCG ACGGCTGATG GACmAATTwA 23340 

tTTGTTGGGA ATTGcCAAAt TAATTAAAGA AAAGGGCCTG GAAATAGCTC CAAAAGTTCC 23400 

TACTAAAAAA TTACCCAGCT TTATTTTGAG CnAnGnGCC 23439 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4522 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

CCCTTTGAGA GTATATCATC TAGTCAAATT ATGCCTGTCA TTAGAGCGAC TAGCTTTGAT 60 

TATTATGCAG TCGATTTAGG GAAATCATAT OGTCTAATTG ACGAAAGCAT GTTAGAGGAT 180 

TTGAAGTTAA CTGAACAACA AATAAGAGAA ATGTCTCTGT TTAATCTTAG AAAATTGTCA 240 

AATTCATATA CGACTGATGA AGTAAAAGGT AATATTTTTT ATTTTATTAA CTCAAATGAC 300

GGGTATGATG CAAGTAGGAT ACTAAATACT GCATTTTTAA ATGAAATTGA GGCACAATGT 360 

CAAGGCGAAA TGCTCGTAGC AGTGCCACAC CAAGATGTGT TAATTATTGC AGATATACGC 420

AATAAAACAG GATATGATGT GATGGCACAT TTAACAATGG AATTTTTCAC TAAAGGTCTA 480

GTTCCAATTA CATCATTATC CTTTGGATAT AAACAGGGTC ATCTTGAACC GATATTTATT 540 

TTAGGTAAAA ATAATAAACA AAAAAGAGAT CCAAACGTGA TTCAGCGTTT AGAAGCAAAT 600 

CGTCGTAAAT TTAATAAAGA TAAATAGAAA TAATTG6ATA AGGAGTTTTG TCATAATGAA 660 

TTTATTTTAC AATCCTAAAT ATGTAGGAGA TGTCGCATTT TTACAAATTG AACCAGTTCA 720 

AGGTGAATTA AACTACAATA AAAAAGGTAA TGTTGTTGAA ATTACtAATG AAGGTAAT6T 780 

TGTAGGTTAT AATATTTTTG AAATTTCAAA AGATATAACA ATTGAAGAAA AAGGTCATAT 840 

TAAATTAACT GAT6AACTTG TAAATGTATT CCAAAAGCGT ATTTCAGAAG CTGGTTTTGA 900 

TTATAAATTA AATGCTGATC TATCACCX3AA ATTTGTAGTT GGCTACGTTG AAACTAAAGA 960

CAAACATCCT GATGCAGATA AATTAAGTGT ACTAAATGTA AACGTTGGAA ATGACACATT 1020 

ACAAATTGTA TGTGGCGCGC CTAACGTTGA AGCTGGACAG AAAGTTGTTG TTGCTAAAGT 1080 

AGGTGCAGTG ATGCCTAGCG GTATGGTAAT TAAAGATGCT GAATTACGTG GTGTTGCCTC 1140

AAGCGGTATG ATTTGTTCAA TGAAAGAATT GAATTTACCT AATGCACCTG AAGAAAAAGG 1200 

TATTATGGTA TTAAATGACA GCTATGAAAT TGGACAAGCA TTtTTTGAAT AATTAAGGAA 1260 

GGTAGTGAAA ATATGAGCTG GTTTGATAAA TTATTCGGCG AAGATAATGA TTCAAATGAT 1320 

GACTTGATTC ATAGAAAGAA AAAAAGACGT CAAGAATCAC AAAATATAGA TrACGATCAT 1380

GACTCATTAC TGCCTCAAAA TAATGATATT TATAGTCGTC CGAGGGGAAA ATTCCGTTTT 1440 

CCTATGAGCG TAGCTTATGA AAATGAAAAT GTTGAACAAT CTGCAGATAC TATTTCAGAT 1500 

GAAAAAGAAC AATACCATCG AGACTATCGC AAACAAAGCC ACGATTCTCG TTCACAAAAA 1560 

CGACATCGCC GTAGAAQAAA TCAAACAACT GAAGAACAAA ATTATAGTGA ACAACGTGGG 1620 

AATTCTAAAA TATCACAGCA AAGTATAAAA TATAAAGATC ATTCACATTA CCATACGAAT 1680 

AAGCCAGGTA CATATGTTTC TGCAATTAAT GGTATTGAGA AGGAAACGCA CAAGCCAAAA 1740 

ACACATAATA TGTATTCTAA TAATACAAAT CATCGTGCTA AAGATTCAAC TCCAGATTAT 1800

CACAAAGAAA GTTTCAAGAC TTCAGAGGTA CCGTCAGCTA TTTTTGGCAC AATGAAACCT 1860 
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AAACAAAAAT ATGATAAATA TGTAGCTAAG ACGCAAACGT CTCAAAATAA ACAATTAGAA 1980 

CAAGAAAAAC AAAATGATAG TGTTGTCAAA CAAGGAACTG CATCTAAATC ATCTGATGAA 2040 

AATGTATCAT CAACAACAAA ATCAATGCCT AATTATTCAA AAGTTGATAA TACTATCAAA 2100 

ATTGAAAATA TTTATGCTTC ACAAATTGTT GAAOAAATTA GACX3TGAACG AGAACGTAAA 2160 

GTGCTTCAAA AGCXSTCGATT TAAAAAAGCG TTGCAACAAA AGCGTGAAGA ACATAAAAAC 2220 

GAAGAGCAAG ATGCAATACA ACGTGCAATT GATGAAATGT ATGCTAAACA AGcGGAACgC 2280 

TATGTTGGTG ATAGTTCATT AAATGATGAT AGTGACTTAA CAGATAATAG TACAGATGCT 2340 

AGTCAGCTTC ATACAAATGG CATAGAGAAT GAAACTGTAT CAAATGATGA AAATAAACAA 2400 

GCGTCAATAC AAAATGAAGA CACTAATGAC ACTCATGTAG ATGAAAGTCC ATACAATTAT 2460 

6AGGAAGTTA GTTTGAaTCA AGTATCGACA ACAAAACAAT TGTCAGATGA TGAAGTTACG 2520 

GTTTCGAATG TAACGTCTCA ACATCAATCA GCACTACAAC ATAACGTTGA AGTAAATGAT 2580

AAAGATGAAC TAAAAAATCA ATCCAGATTA ATTGCTGATT CAGAAGAAGA TGGAGCAACXS 2640 

aATAAAGAAG AATATTCAGk AAGTCAAATC GATGATGCAG AATTTTATGA ATTAAATGAT 2700 

ACAGAAGTAG ATGAGGATAC TACTTCAAAT ATCGAAGATA ATACCAATAG AAACGCGTCT 2760 

GAAATGCATG TAGACGCTCC TAAAACGCAA GAGTACGCAG TAACTGAATC TCAAGTAAAT 2820

AATATCGATA AAACGGTTGA TAATGAAATT GAATTAGCAC CGCGTCATAA AAAAGATGAC 2880 

CAAACAAACT TAAGTGTCAA CTCATTGAAA ACQAATGATG TGAATGATAA TCATGTTGTC 2940

GAAGATTCAA GCATGAATGA AATAGAAAAG AATAACGCAG AAATTACAGA AAATGTOCAA 3000 

AACGAAGCAG CTGAAAGTGA ACAAAATGTC GAAGAGAAAA CTATTGAAAA CGTAAATCCA 3060 

AAGAAACAGA CTGAAAAGGT TTCAACTTTA AGTAAAAGAC CATTTAATGT TGTCATGACG 3120

CCATCTGATA AAAAGCGTAT GATGGATCGT AAAAAGCATT CAAAAGTCAA TGTGCCTGAA 3180 

TTAAAGCCTG TACAAAGTAA GCAAGCTGTG AGTGAAAGAA TGCCTGCGAG TCAAGCCACA 3240 

CCATCATCAA GATCTGATTC ACAAGAGTCA AATACAAATG CATATAAAAC AAATAATATG 3300 

ACATCAAACA ATGTTGaGTlA CAATCAACTT ATTGGTCATG CAGAAACAGA AAATGATTAT 3360 

CAAAATGCAC AACAATATTC AGAGCAGAAA CCTTCTGTTG aTTCAACTCA AACGGAAATA 3420 

TTTGAAGAAA GTCAAGATGA TAATCAATTG GAAAATGAGC AAGTTGATCA ATCAACTTCG 3480 

TCTTCAGTTT CAGAAGTAAG CGACATAACT GAAGAAAGCG AAGAAACAAC ACATCCAAAC 3540 

AATACTAGTG GACAACAAGA TAATGATGAT CAACAAAAAG ATTTACAGTC ATCATTTTCA 3600 

AATAAAAATG AAGATACAGC TAATGAAAAT AGACCTCGGA CGAACCAACA AGATGTTGCA 3660 
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CCAAGTGTTT CATTACTAGA AGAACCACAA GTTATTGAGT CGGACGAGGA CTGGATTACA 3780

GATAAAAAGA AAGAACTGAA TGACGCATTA TTTTACTTTA ATGTACCTGC AGAAGTACAA 3840

GATGTAACTG AAGGTCCAAG TGTTACAAGA TTTGAATTAT CAGTTGAAAA AGGTGTTAAA 3900

GTTTCAAGAA TTACGGCATT ACAAGATGAC ATTAAAATGG CATTGGCAGC GAAAGATATT 3960

CGTATAGAAG CGCCTATTCC AGGAACTAGT CGTGTTGGTA TTGAAGTTCC GAACCAAAAT 4020

CCAACGACAG TCAACTTACG TTCTATTATT GAATCTCCaA GTTTTAAAAA TGCTGAATCT 4080

AAATTAACAG TTGCGATGGG GTATAGAATT AATAATGAAC CATTACTTAT GGATATTGCT 4140

AAAACGCCAC ACGCACTAAT TGCAGGTGCA ACTGGATCAG GGAAATCAGT TTGTATCAAT 4200

AGTATTTTGA TGTCTTTACT ATATAAAAAT CATCCTGAGG AATTAAGATT ATTACTTATC 4260

GATCCAAAAA TGGTTGAATT AGCTCCTTAT AATGGTTTQC CACATTTAGT TGCACCGGTA 4320

ATTACAGATG TCAAAGCAGC TACACAGAGT TTAAAATGGG CCGTAGAAGA AATGGAACGA 4380

OGTTATAAGT TATTTGCACA TTACCCATGT ACGTAnTATA ACAGCATTTA ACnAAAAAGC 4440

CCCATATGAT GAAAGAATGn CAAAAATTGT CATTGTAaTT GATGAGTTGG CTQATTTAAT 4500

GATGATGGTC CGCAAGAAGT TG 4522



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 751 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
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TCAAGTTTAC GGATACGTAT ATATTTTGCA TGACATTTAG TGCAATAATA TTCATAATTT 60

GCCCGTTGTT GATAGCTTTC AATGCTGTTA CAAAATCTAG GCGCTCCAAC CTGTTGGCTC 120

AATCGTTTAA AATCTTGATC TTTATGTTGA TAACCTTTAC CAGCAATATG CAAGTGATAA 180

TGACACAATT CGTGCAGTAT AATTTTTACA ACAGCATCTT CTCCATAATG CTCATATTGT 240

TTTGGATTAA TTTCAATATC ATGGGACTTT AAAAGATAAC GTCCGCCTGT TGTACGTAAC 300

CTTTTATTAA AATATGCACA ATGTCGAAAC GTACGTCCAA ATTTTTCTTC CGAAAGATTC 360

TCAACCATTC GCTGAAGTTT GTCATTATTC ATGTGGATCA ATCATCGTTA ATGATACTTT 420

GTCTTTATTT TTGTCAATAC TGTAAATCCA AACGTCAACG ATATCACCAA CACTGACAAT 480

ATCCATTGGA TTTTTTACGA ACTTCTTAGA AAGTTTCGAA ACATGGACAA GTCCATCTTG 540
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TTTCATTCCT TCTTGTAAAT CTTCAATTGA TAGCACATCG GATTTAAGGA TTGGTGTTTC 660 
AAACTCGTCC CTTGGATCTC GATTAGGTGC GTTCAAGGAT TTAATAATAT CCTCTAATGT 720 

AGGTACACCG ACTTGTAATT CAATCGCCAG T 751 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1076 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
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TCTCCAGCTT TAACTTGATC TGGCACTTTA ACAATTGTCT GATCCATACA

ATAACTTCGC ATTGATGACC ATTTACATTT ACAAAGCTAC CTTGCATTAT GC6TAAAT6G 120

CCATCTGCAT ATCCAATAgG TAACAATGCT ATTGTAGTTG GGTCAGTAGC TGTATAAGTT 180

GCACCATAAC TTACAGACTC ACCCGCTTGT AGCGTCTTTG TTTGAACTAC ATTAGCAATT 240

1 LK^lrl TAAG GTGTACTTTA AC l i l l lGCT GTACATACTC TGATGGATAA 300

TATCCATAAA GGGAAATTCC TGGTCTTATT GCATTACAGA ATTGGCAATC CATTAATAGA 360

GAGCCTGCTG AGTTCTGACA ATGTATATAT TCAGGTTTAA TTGCTTCATT GACCATATCT 420

TTAAAACGTT GATATTGTTC AGTTGTCATA TCTCCTGGTT CGTCAGCACA GGCAAAGTGT 480

GTAAACACGC CTTCAAATAC AAGTTGCTCA TATTGTTGAA TGATTTCAAT CACTTCTTGA 540

TACGTTTTAG TATCTTTAAT ACCTAAACGT CCCATTCCTG TATCTAATTT AATGTGCAAC 600

CATAACTTTT TCTCTTGCTC ACCAGAAATG TTTTTAATTG CTTCTTTCAA CCACTGTTTA 660

GACGGAACCG TTAAGGCAAC TCGGTGTTGT ATCGCTTTAT CAATATCTTT AGCTGGTAAC 720

ACACCTAAGA CTAAAATTTT AGCAGTAATC CCATGCATTC TAAGTTCTAT CGCTTCATCT 780

AACGTTGCTA CAGCAAAAAA TGTGGCGCCA TTTTCCATTA AATGACGTGC TACTTTAACA 840

CTACCTAGTC CATAGGCATT GGCTTTAACG ACAGCCATCA CTGTTTTATT TGGAT6CAAT 900

GTACTGAATA CTTTGAAATT TGATGCAACA GCGTTTAAAT CTACATTCAT ATACGCAGAT 960

CTATAATATT TATCCGACAT ATTACTTCCT CCTGTAATTC CCACACGTTT TAAAACTAGA 1020

TCTTAATTAT CATTGTATAA CAAATTTAAA ATGCTGACTT TTCTAAAACA ACTTGG 1076



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2930 base pairs
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(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

TGACCACAAT GCCCAATACA ACCATCCCAT GGTAAAGCCA AGAGATOAGT CAATAAAGCG 60 

TGTTGAATAA GAGCTGAATG AACCTGATAC TGGATAAAAT GTTGCCAACT CTCCAATTGA 120 

TGACATTAAG AAATATAGCA TGACACCAAT AACAAGATAA QCGAGTATAG CGCCTCCAGG 180 

ACCAGCTTGA GAAATGATAT TACCAGTAGC TACAAATAGA CCAGTCCCAA TTGCACCACC 240 

TATAGCAATC ATGGAAATGT GTCTTGAGTT AAGACTACGG TTCATTTTAT TATCTTCCAT 300 

ATTTAGTCTC CCATCTATTT AAATATACCC ATTATTGTAA GCTTTTTAAG TGTACTATTC 360 

AATAACTATT TAGTACTGTA AAGCGAAAAA ATTAAAATTT TCTGATTTTT TAATCATCTT 420 

GAGCAT6TTT AATTGTAATT TTGATGGGGT TAAATTATAA TATGTATTAA ATTATAATTA 480

TnATAAATTG TGGAGGGaTG ACTATGTCAC AACAAGACAA AAAGTTAACT GGTGTTTTTG 540 

GGCATCCAGT ATCAGACCGA GAAAATAGTA TGACAGCAGG GCCTAGGGGA CCTCTTTTAA 600 

T6CAAGATAT TTACTTTTTA GAGCAAATGT CTCAATTTCA TAGAGAAGTA ATACCAGAAC 660

GTCGAATGCA TGCCAAAGGT TCTGGTGCAT TTGGGACATT TACTGTAACT AAAGATATAA 720 

CAAAATATAC GAATGCTAAA AtATTCTCTG AAATAGGTAA GCAAACCGAA ATGTTTGCCC 780

GTTTCTCTAC TGTAGCAGGA GAACGTGGTG CTGCTGATGC GGAcGTGACA TTCGAGGATT 840 

TGCGTTAAAG TTCTACACTG AAGAAGGGAA CTGGGaTTTA GTAGGGAATA ACACACCaGT 900 

ATTCTTCTTT AGAGATCCAA AGTTATTTGT TAGTTTAAAT CGTGCGGTGA AACGAGATCC 960 

TAGAACAAAT ATGAGAGATG CACAAAATAA CTGGGATTTC TGGaCGGGTt TCCAGAAGCA 1020 

TTGCACCAAG TAACGATCTT AATGTCAGAT AGAGGGATTC CTAAAGATTT ACGTCATATG 1080 

CATdGGTTCG GTTCTCACAC ATACTCTATG TATAATGATT CTGGTGAACG TGTTTGGGTT 1140 

AAATTCCATT TTAGAACGCA ACAAGGTATT GAAAACTTAA CTGATGAAGA AGCTGCTGAA 1200 

ATTATAGCTA CAGATCGTGA TTCATCTCAA CGCGATTTAT TCGAAGCCAT TGAAAAAGGT 1260 

GATTATCCAA AATGGACAAT GTATATTCAA GTAATGACTG AGGAACAAGC TAAAAACCAT 1320 

AAAGATAATC CATTTGATTT AACAAAAOTA TGGTATCACG ATGAGTATCC TCTAATTGAA 1380 

GTTGGAGAGT TTGAATTAAA TAGAAATCCA GATAATTACT TTATGGATGT TGAACAAGCT 1440 

GCGTTTGCAC CAACTAATAT TATTCCAGGA TTAGATTTTT CTCCAGACAA AATGCTGCAA 1500

GGGCGTTTAT TCTCATATGG CGATGCGCAA AGATATCGAT TAGGAGTTAA TCATTGGCAG 1560 
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GGTCAAATGC GCGTAGTTGA CAATAACCAA GGTGGAGGAA CACATTATTA TCCAAATAAC 1680 

CATGGTAAAT TTGATTCTCA ACCTGAATAT AAAAAGCCAC CATTCCCAAC TGATGGATAC 1740 

GGCTATGAAT ATAATCAACXJ TCAAGATGAT GATAATTATT TTGAACAACC AGGTAAATTG 1800

TTTAGATTAC AATCAGAGGA CGCTAAAGAA AGAATTTTTA CAAATACAGC AAATGCAATG 1860 

GAAGGCGTAA CGGATGATGT TAAACGACGT CATATT06TC ATTGTTACAA AGCTGACCCA 1920 

GAATATGGTA AAGGTGTTGC AAAAGCATTA GGTATTGATA TAAATTCTAT TGATCTTGAA 1980 

ACTGAAAATG AT6AAACATA CGAAAACTTT GAAAAATAAA TTTGATATGT AGTTTCTATA 2040 

TTGCGTAGTT GAGCAGTTTA TGATATCATA ATAAATCGTA AAGATTCCTA ACAAGAGAGG 2100 

GTGTTTAACG TGCGCGTAAA CGTAACATTA GCATGCACAG AATGTGGCGA TCGTAACTAT 2160 

ATCACTACTA AAAATAAACG TAATAATCCT GAGCGTATTG AAATGAAAAA ATATTGCCCA 2220 

AGATTAAACA AATATACGTT ACATCGTGAA ACTAAGTAAT TCTTATCATT CAAATACGAC 2280 

GATTTGAAAA TAAAGCGGGC TTACCTATTA TATTGGGGAG CTCGCTTTTT TATGAAATTT 2340 

TTGTGAAGAG TGATTAATGG ATTGAGTTTC ATCGGTAGAA CAATATATGA TTATATTAGT 2400 

TGTTACTTTA TTAAAaTTTG AGAATATTTA TAGAAGGAAA TAGATTACTG ATTTTATAAA 2460 

QTCACTTTGT TAGCGAAT6C TTGAAAGAGT ATTTAATATA GTAGAATTTA AAATTTCAAA 2520 

GCXSGAATTTA ATAAGTACGA AGTAGTTCTG QGTATGTTTT ATAAATGTTC GATAATACAC 2580 

TTTAATCTTA AATATGATGG TTTAGAAAAT GATTTAACAA AGAAATGAaA CTTTACTGTT 2640 

GAATTAT6TG AGGATTGTGT TATTATATAA ATCGTAATAA TTACGATTTG ATAAAAAGTG 2700 

AGGTAACTAT A7ATGGCTAA GAAATCTAAA ATAGCAAAAG AGAGAAAAAG AGAAGAGTTA 2760 

GTAAATAAAT ATTACXSAATT ACGTAAAGAG TTAAAAGCAA AAGGTGATTA CGAAGCGTTA 2820

AGAAAATTAC CAAGAGATTC ATCACCTACA CGTTTAACTA GAAGATGTAA AGTAACTGGA 2880 

AGAC5CTAGAG GTGTATTACG TAAATTTGAA ATGTCTCGTA TTGOGTTTAG 2930 

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3606 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

CTTCTTGCCA TGGCTCTCTT TATTTAAAAA TGCTTCCAAC TTGTCCATTT GATTGTTTCT 60

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TTATAAAAAA CTAATTTTAC AAATGCTTTT GCGTTCTTAC AAAAAATGCA TTTGACTATT 180

ATTATAATAA GCGTATAATT GTCGCATATT ATTTTTTGTA TTTTTGGCAA TAACGAAGGA 240

GTATTTATGA ATAAAGACAA GCAATTGCAC AACXW^CAAAA TCAATCTATC CCAATTAGTC 300

TTATTAGGGT TAGGCTCTTT AATAGGATCT GGTTGGCTAT TTGGTGCGTG GGAAGCATCA 360

TCAATAGCTG GACCA6CAGC AATCATATCA TGGGTTCTTG GATTCCTAGT CATTGGAACC 420

ATTGCCTATA ACTACATTGA AATCGGCACA ATGTTTCCTC AATCAGGTGG CATGAGTAAC 480

TATGCCCAGT ATACACATGG CTCATTATTA GGCTTTATTG CTGCTTGGGC GAATTGGGTG 540

TCTTTGGTGA CAATAATACC TATCGAAGCT GTGTCAGCTG TTCAATATAT QAGTTCTTGG 600

CCGTGGCATT GGGCGAAACC AATGAGATAT TTAATGGAAA ATGGCTCTAT TAGCACATAC 660

GGATTGCTAG CTGTATATCT CATCATTGTT ATTTTTTCAT TATTAAACTA TTGGTCCGTA 720

AAACTTTTAA CATCATTTAC GAGTTTAATT TCTGTATTTA AATTAGGCGT ACCCATGTTA 780

ACCATCATCA TGTTGATGCT ATCAGGATTC GACACTTCAA ATTACGGCCA TTCGGCAAGC 840

ACATTTATGC CTTACGGAAG TGCACCGATT TTTGCTGCAA CAACAGCATC AGGGATTATT 900

TTTTCATTCA ATTCATTCCA GACAATTATT AATATGGGTT CAGAAATTAA AAATCCTGAA 960

AAAAATATCG CAAGAGGCAT CGCTATCTCA CTGTCAATCA GTGCAGTGTT GTACATCATT 1020

TTACAAAGTA CGTTTATCAC TTCTATGCCT CAATCAATGT TACAACATAG TGGATGGAAT 1080

GGCATCAACT TCAATTCACC ATTTGCTGAT TTAGCTATCT TATTAGGAAT TAATTGGCTC 1140

GCAATTTTAC TATACATTGA AGCTTTTGTA TCACCATTCG GTACTGGCGT GTCATTTGTC 1200

GCCGTTACAG GTCGAGTTTT ACGAGCAATG GAGAAAAATG GACATATCCC TAAATTTCTT 1260

GGGAAGATGA ATGAAAAATA TCATATCCCA CGTGTAGCAA TCATCTTTAA TGCCATCATT 1320

AGTJdXy^TTA TGGTTACATT ATTTAGAGAT TGGGGTACGC TAGCAGCAGT TATTTCTACT 1380

6CAACTTTAG TAGCCTATTT AACTGGCCCA ACGACAGTGA TTOCATTAAG AAAAATGGGA 1440

CCAACAATGA CTCGTCCATT TAGAGCAAAA ATTTTAAAAG TAATGGCACC ATTATCATTT 1500

6TATTAGCTT CATTAGCTAT ATATTGGGCA ATGTGGCCAA CAACGGCT6A AGTTATTTTA 1560

ATCATTATAC TTGGATTACC AATCTACTTC TTCTATGAAT ATCGTATGAA TTGGCGTAAT 1620

ACAAAGAAAC AAATTGGTGG TAGCTTATGG ATTATTGTAT ATTTAATCGT GCTATCAATA 1680

CTGTCATTTA TAGGAAGCAA AGAATTTAAA GGCTTAAATA TGATTCACTA TCCATTTGAC 1740

TTTATCGTTA TTATTATTGT GGCACTTATC TTCTATTACA TCGGTACAAC GAGTTCATTT 1800

GAAAGCGTCT ATTTCCGTCG CGCAACACGA ATCAATACGA AGATGCGTGA GTCACTAAAT 1860
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15109 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

GAAATTAAAA AAGCAATTGG nACAAGATGC AACAGTGTCA TTGTTTGATG AATTTGATAA 60 

AAAATTATAC ACTTACGGCG ATAACTGGGG TCGTGGTGGA GAAGTATTAT ATCAAGCATT 120 

TGGTTTGAAA ATGCAACsAG AACAACAAAA GTTAACTGCA AAAGCAGGTT GGGCTGAAGT 180 

GAAACAAGAA GAAATTGAAA AATATGCTGG TGATTACATT GTGAGTACAA GTGAAGGTAA 240 

ACCTACACCA GGATACGAAT CAACAAACAT GTGGaAGAAT TTGAAAGCTA CTAAAGAAGG 300 

ACATATTGTT AAAGTTGATG CTGGTACATA CTGGTACAAC GATCCTTATA CATTAGATTT 360 

CATGCGTAAA GATTTAAAAG AmAAATTAAT TAAAGCTGCA AAATAATTCA GCTATATAAG 420 

TTAGTGAAAT GAGAGTCTGA AACATATCAA TCTTTTGATA TTGTATTAGG CTCTTATTTT 480 

TATA5CTAGA AAGTTAGATA TTTGTAnTT TTTAAATAAT AAGTGCCGTT GTTATCGTTC 540 

AATTTAATTA ATGATAGATT AGTATTATTA TAGCTAAAGT AGTATACCTG AGAAAATAGC 600 

TCAATGTATC TCTTTATTAA TAAGTTATAT CATAATTATT TTAGTGCATA CTTTATGGAA 660 

GGGATATCAG GGAATGGCTT TCAATTAAAG AAGAG6TTTA AAAGGATTAC AACAGAATGT 720 

TATGATTTTG TAGAAAGATA TATAACAACG TTTTATAAAA ACATAATATT GTTAATGGAA 780 

AATGAAATGT AAGGGGGATT TCGAGTGACT AAGAAAGTTT ATTTTAACCA CGATGGTGGT 840 

GTAGATGATT TAGTATCTCT ATTTTTATTA TTACAAATGG AAAACGTTCA ATTGATAGGG 900 

GTOCbTACAA TTGGTGCTGA TTGTTATTTA GAGCCATCTT TGAGCGCATC AGTAAAAATT 960 

ATTAATCGTT TTTCAAATGA AGATATTCAA GTTGCGCCAT CATATGAACG AGGAAAAAAT 1020 

CCATTTCCTA AAGAATGGCG TATGCATGCC TTTTTTATQG ACGCATTGCC AATTTTAAAT 1080

GAGCCAGTCA AACATGTTGC TTCAAATGTO AGCQACAAAG AAGCCTTTGA AGACATTATT 1140 

CAAACTTTAA AGAGACAATC AGAAAAAGTA ACATTATTAT TTACAGGCCC GCTTACAGAT 1200 

TTAGCAAAAG CACTACAAAA AGATTCATCT ATCGTTCAGT ATATAGAAAA ATTAGTTTGG 1260

ATGGGTGGCA CCTTTTTACC AAAAGGAAAT GTTGAAGAAC CTGAGCATGA TGGTTCTGCA 1320 

GAATGGAATG CATATTGGGA TCCAGAAGCG GTTAAAATTG TTTTTGATAG CGATATAGAG 1380 

ATTGATATGG TTGCTTTAGA AAGTACGAAT CAAGTACCGC TAACGTTAGA TGTTAGACAA 1440

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GTACCACCAT TAACACACTT TATAACAAAT TCTACTTACT TTTTATGGGA TGTTTTAACG 1560

ACTGCTTATA TTGGTAACAA GGACTTGGTT CATTCAATTG AGAAAAAAGT CGATGTAATA 1620

AGTTATGGAC CAAGTCAAGG TAAGACATTT GAGTGTAAAG ATGGGCGCAA AATTAATGTC 1680

ATAAATCATG TAGATAACAA CGCATTTTTT GATTATATAA CTGCACTTGC TAAAAAAGTA 1740

AATTAACAGC TGTGTAGAAT AATTAAGGTT TTAATTTATA TAGAACAACT TATTGTAAAC 1800

TTTTCATTTC TTAAAGTTTA CAATGGTGCT ATAATAATGG TCATGAAATA CGAAAGGAAG 1860

TAAAAAATGA CAACAAAACA GTTAGTATAT ACAGCTTTAA TGACAGCGAT TATCGCTATT 1920

TTAGGATTGG TACCGGTAAT TCCACTACCA TTTTCTTCAG TACCAATTGT ACTTCAAAAC 1980

ATTGGTATTT TCTTAGCAGG TGCGATTTTA GGACGTAAAT ATGGCACATT AAGTGTTATC 2040

GTCTTTTTAT TATTAGTAGT TGCTGGCTTG CCATTGTTAT CAGGTGGTCG CGGTGGCATC 2100

GGTGTATTCG CAGGTCCTTC AGCAGGGTTT TTACTATTAT ATCCAGTTGT AGCATTCATG 2160

ATTGGGGCGA TTCGAGATAG ATTCATCAAT GAAATTAATT TCTGGATTTT ATTCGTTGGT 2220

ATTTTAGTTT TTGGTGTTAT AGCATTAGAT GTTATTGGTA CATTGATTAT GGGCATGATT 2280

ATTAACATAC CATTTACGAA AGCTATTTCA ATTTCATTAG CTTATTTGCC TGGTGATATA 2340

TTAAAAGCAA TTGTAGCAAG TTTGATTGGT ACAGCTTTAC TTAATCACTC GCAGTTTCGT 2400

CAAATTATGG GAATAAAATA ATCATATTTA AGATAGTAAA GTAATTGAAT AAGTTGCTTT 2460

GAAATTTATA AAAGTGAAAG GAGTAGGTGT CAATGGCTAG TATAAGTATG TCA6ATATAT 2520

ATTGTAACGG CACTATATTT GAAAATGACG ACGAGCAGTT GATTTATTTA ACGCCTTCTT 2580

TTCCACAACG ATACACAAGT AACACATGGA TATATAAAAA GACGCCTACC CAAGAGCGAT 2640

GGCTGAAAGA CTTAGAACGT CAACATCAAT TACATACAAA TCAAGGTTCA AATCATTATG 2700

CGTTTAGTTT CCCGGAAAAT GAACAACTTG ATAATCATTG GATGGCTATG TTTAAAGATA 2760

TGAATTTTGA ACTAGGTATT ATGGAATTGT ATGCCATAGA AAGTGATGCX3 CTTGCCAATT 2820

TGCCGCGTAA CTCTGACGTT GAAATTGCCA TCGTTGACGA GTCGCATATA GATGCCTATT 2880

TAAAAGTTGC ATATCAGTTT AGTTTGCCAT TTGGAAAAGA CTATGCAGAT GCACATGAAG 2940

AAATGGTAAG GGAACATTAT CAAAAAGATG TGATTAAACG CTTAGTAGCT TATTTAAATA 3000

ATGAACCTAT TGGCGTTGTA GATGTCATTG AAAGTGAAAA TTACATTGAA TTAGATGGAT 3060

TTGGTGTATT AGAACAATTT CGGCACCAAG GAATTGGATC TACAATTCAA TCGTTGATAG 3120

GTGAATACGC CATATCAAAA AATCACAAAC CAATCATATT AGTTGCAGAT GGTGAAGATA 3180

CAGCAAAAGA TATGTATGCA AAGCAAGGTT ATGTCTATCA ATCGTTTTGT TATCAAATAT 3240

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TAAGCTGGTT TCGAGTAGAA ATCAACTTAC TGCTTTTTAA ATTGTTTTGA GCTACTTATA 3360

CTTATAAAAA TAGTGCOTTT AAATTGTTGA TTCATGTAGA ATATCGTTCA TTATGACACA 3420

CTATAATGAA TATGTTATTG TTCAGAATCA ATGATACGTT CTGGATGACT GTATATATTA 3480

AAGCCACCAT TTCGAATAAA TCCAACTGCC GTAATATTTA GGTCATTAGC TAAGGTTACA 3540

GCAAGCGTTG TCGGAGCTGA TTTAGATAAA ATGACGCCAA CACCAATTTT TGCGGCTTTA 3600

ATTAAAATTT CTGATGAAAT ACGTCCACTA AAAATTAATA CTTTATCTCG GACAGTAATA 3660

TGTCGCTGAA TACAAAATCC ATATAATTTA TCTAGAGCGT TATGTCTACC AATGTCTTGT 3720

CGATGTACAA AAAATGTCAA ACCATCGCTT ATA6CAGCAT TATGTAAGCC ACCTGTTTCT 3780

TGGTAAATAT GACTTGCACT TTGTAATCGA GTCATCATGT TAATAATTTG CATTGGAGTT 3840

AAA6TGATTT TAGACATAGA TGTTTTAGCG ATAGCAGCAT CATTTTGAAA ATAAAACTCA 3900

CGACTCTTTC CGCAACAAGA TGCAATCATT CGTTTTGTGG AATATTGAAA GCGATCGCCT 3960

AAATCTTTAT TAAGTTCAAC ATGGGCAAAA CCTTTACTAT CATCAATCAG TACAGATTTT 4020

AATTCATCTC GCTTTAAAAT GGCACCTTCC GAAGCCAGAA ATCCAATGAC TAACTCCTCA 4080

AGGTTTGTT6 GACTGCATAT AACAGTCGCA AATTCTTCAC CATTCACCAT AATTGTAAGT 4140

GGAAATTCTG TCACATATTG ATCTGTTGTA TTGAATAATT TTCCATCTTC ATATCTAACA 4200

ATTGGTTGAC CTAAAGATAC ATCTTTGTTC ATTATCTAAC CCCTTTAATT AGCTTAAACT 4260

TTATTTTAAA GCAATTTGCT TAAAATTTTA ACATATTTGC TTAAGTTTGA AATTTGATTG 4320

ATAAAAATTA ATAGCGAGCA ATCTGTTTGA TTTAAATTGA ATTCGAGAAT ATACATACTA 4380

GGGCATCAAT TAATAAATAT CAATCTTATG CAAATTTGAC AATTGTTTGA ATCAATATAT 4440

AAACAGGCAA CGGTTCTTTT CAAATATAAT AGTAAGTGTA TAATGAAAAT GTAAATATTA 4500

TIAMAATGG GGGTTCACTC AATGAAATTG AAAC6TTTAT TTGCTGTTGT GATTGCAATG 4560

CTTTTAGTAT TAGCTGGTTG CTCTAATTCT AACGATAATA ATGAAAGTAA AAAAGATGAC 4620

6CAGAGAATG GTAAGAAACA AGAGATTCAA GTTGCAGCGG CAGCAAGTTT AACAGATGTA 4680

ACCAAGAAAT TAGCTTCAGA ATTTAAAAAA GAGCATAAAA ATGCTGATAT TAAATTTAAC 4740

TATGGTGGAT CAGGGGCATT AAGAAAACAA ATTGAATCAO GCGCACCTGT TGACGTATTT 4800

ATGTCTGCAA ATACTAAAGA TGTAGATGCA TTAAAAGACA AGAATAAAGC GCATGATACA 4860

TATAAATATG CGAAAAATAG TCTAGTATTA ATTGGTGATA AAGATTCAAA TTACACTTCA 4920

GTAAAAGACT TAAAAGACAA TGATAAATTA GCATTAGGTG AAGTGAAAAC TGTACCAGCA 4980

GGAAAATATG CGAAACAGTA TTTAGATAAC AATAACTTAT TTAAAGAAGT CGAAAGTAAA 5040

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CAAGGTTTTG TGTATAAAAC TGACTTATAT AAACAAAATA AAAAAATTGA TACTGTAAAA 5160 

GTAATTAAAG AAGTAGAACT TAAGAAGCCA ATCACATACG AAGCTGGTGC TACATCAGAT 5220 

AGTAAATTAG CAAAAGAGTG GATQGAATTC TTAAAATCAG ATAAAGCTAA AGAAATACTA 5280

AAAGAATACC ACTTTGCAGC ATAAGGAGTT GTAATCCATG CCTGACTTAA CACCTTTTTG 5340 

GATATCAATA CGAGTTGCTG TAATCAGTAC GATTATTGTA ACGGTTTTAG GTATTTTTAT 5400 

ATCTAAATGG TTGTATCGTC GTAAGGGTTC GTGGGTTAAA GTATTGGAAA GTTTATTGAT 5460 

ATTACCTATT GTTTTGCCGC CAACGGTATT AGGTTTTATT CTATTAATCA TCTTCTCGCC 5520 

AAGAGGACCA ATCGGTCAAT TCTTTGCGAA TGTACTACAT TTACCTGTAG TGTTCACTTT 5580 

GACAGGTGCT GTGATAGCAT CTGTCATTGT TAGTTTTCCA CTAATGTATC AACATACTGT 5640 

GCAAGGCXTC AGAGGTATAG ACACGAAAAT GATTAATACA GCTAGAACGA TGGGAGCAAG 5700 

TGAAACGAAA ATTTTCCTCA AATTAATTTT ACCATTAGCT AAACGCTCTA TTTTAGCAGG 5760 

TATAATGATG AGTTTTGCTC GTGCATTAGG TGAGTTTGGT GCTACATTAA TGGTTGCAGG 5820 

ATATATTCCA AATAAAACGA ATACACTACC TTTAGAAATA TACTTCTTAG TGGAACAAGG 5880 

TAGAGAAAAT GAAGCGTGGT TATGGGTATT AGTGCTAGTC GCATTCTCTA TTGTGGTTAT 5940

ATCTACAATT AATTTATTGA ATAAAGATAA ATATAAGGAG GTCGACTAGA TGCTTAAAAT 6000 

CAATGTGAAA TATCAATTAA AGAACACTTT AATTCGCATC AATATAGATG ATACTGAACC 6060 

AAAAATTTAT GCAGTTCGTG GTCCATCTGG CATTGGTAAA ACTACTGTTT TAAATATGAT 6120 

TGCCGGATTA CGTAAAGCAG ATGAAGCTAT TATCGAAGTG PATGOGCMT TACTTACTGA 6180 

TACGGCAAAA AACGTGAATG TTAAAATTCA ACAACGACGT ATTGGATATC TGTTTCAAGA 6240 

CTACCAATTG TTTCCTAATA TGACGGTCTA TAAAAATATT ACTTTTATGG CTGAACCATC 6300 

TGAACACATC GATCAATTAA TTCAAACTTT AAACATTGAT CATTTGATGA AACAATATCC 6360 

TATGACATTG TCAGGTGGAG AGGCACAACG TGTAGCACTT GCACGTGCAC TTAGCACrAA 6420 

ACCAGATTTA ATTTTATTAG ATGAACCTTT TTCTAGTTTG GATGATACTA CAAAAGATGA 6480 

GAGTATTACA TTAGTTAAAC GTATTTTCAA CGAATGGCAA ATACCAATCA TATTTGTGAC 6540 

ACATTCAAAC TATGAAGCAG AACAAATGGC TCATGAAATT ATTACAATTG GGTAATCATT 6600

TATTTGCCAT TAAAGAGTTT AGAACGTATT TAAAATTGTA GAAGTGAATG CTTCTATCAG 6660 

CATTTTAATG ATGTTTTAAA CTCTTTTTTA GGGGCAGTTT TTTTGAGAGA CATTGACGCG 6720 

CGTCATATAA TGAAAGTAAT GATAAAAAGA AAGGATAACT TAATGTGAGT CAAGAACGTT 6780

ATTCAAGGCA AATTTTATTT AAACAAATAG GTGAAATAGG TCAAAGCAAA ATAAATCAAA 6840
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

GAGCAGGCAT TGCCAAACTA ATCATTGTT6 ATAGAGATTA TATTGAATTT AGTAATTTAC 6960 

AAAGACAAAC ATTGTTTACT GAAGAAGATG CTTTGAAAAT GATGCCTAAG GTGGTTGCAG 7020 

CTAAAAAGCA TTTGCTAGCG TTACGTAGTG ATGTTGATAT TGATGATTAT ATTGCCCATG 7080 

TGGATTATTA TTTTTTGGAA ACACATGGAC AGGACGTTGA CGTTATTATT GATGCAACCG 7140 

ATAACTTTGA AACACGACAA CTGATTAATG ATTTTGCATA TAAATATCGT ATACCTTGGA 7200 

TTTATGGTGG TGTTGTACAG AGTACATATA CAGAAGCTGC ATTTATACCT GGTAAAACAC 7260 

CTTGCTTTAA CTGTTTGGTA CCACAATTGC CAGCATTAAA TTTAACATGT GATACAGTAG 7320 

GGGTCATTCA ACCTGCCGTG ACGATGGCAA CAAGTTTACA ATTAAGAGAT GCGATGAAAG 7380 

TATTAACGGA ACAACCAATT GACACAAAAA TAACTTATGG CGATATTTGG 6AAGGTAGTC 7440 

ATTATTCATT TGGTTTCAGT AAAATGCAAC GTTCAGACTG TACAACTTGT GGAGATGTAC 7500 

CAAGTTATCC GTATTTAAAC AAGAATGAAC AACGTTATQC AACATTGTGT GGTAGAGACA 7560

CTGTACAGTA TGAAAATGCA TCAATTACAC ACGACATTCT TGTTCAATTT TTAAAACAAC 7620 

ATCAGTTAAA TTATCGCAGT AATTCGTATA TGGTTATGTT TGAATTTAAA GGACACCGCA 7680 

TTGTTGCTTT TAAAGGTGGA AGGTTTTTAA TACATQGCAT GACACGCACA TCAGATGCCA 7740

CACATCTAAT GAATTTATTG TTTGGATAAA AAAAGATAAG ACAAAAGGAG TGTAATATTA 7800 

TGGGCGAACA TCAAAACGTT AAATTGAATC GTACAGTTAA AGCAGCCGTA CTAACGGTAT 7860 

CAGATACTAG AGACTTTGAT ACAGATAAAG GTGGTCAATG CGTGCGCCAA CTATTACAAG 7920 

CAGATGACGT TGAAGTGAGT GACGCACATT ATACAATTGT GAAAGATGAA AAAGTAGCCA 7980 

TCACGACGCA GGTGAAGAAG TGGTTAGAAG AAGATATTGA TGTCATCATT ACGACTGGTG 8040 

GAACAGGTAT TGCACAACGT GATGTGACGA TTGAAGCAGT AAAACCACTT TTAACTAAAG 8100 

AGAtAGAAGG CTTTGGGGAA TTGTTTAGAT ATTTGAGTTA TGTTGAAGAT GTTGGCACGC 8160 

GTGCATTATT GTCTCGTGCT GTAGCAGGTA CAGTTAATAA TAAATTGATA TTTTCGATTC 8220 

CAGGATCAAC AGGCGCAGTT AAATTAGCAT TAGAAAAGCT CATTAAACCA GAATTAAATC 8280 

ATCTGATTCA TGAGCTTACA AAATAATTTA TTGATTTGAT TGGCGTTGAA AATCTCCAGA 8340 

TTTACCGCCA GACTTGCTTT CAAGGTAGGT TTCGCCAATA ATCATACCTT TATCAACTGC 8400

TTTCGTCATG TCGTAAATGG TTAAAGCCGT TGCTGATGCA GCGGTTAAAG CTTCCATITC 8460 

AACACCGGTT TTGCCAGTTG TAGAGACAGT TGTTTGAATG TTTAAAGTAT AAAGGGGTGC 8520 

ATTTGTTTCA TCCCAGCTGA AGTGAACATC TATGCCAGTC AATGGTAATG GATGGCACAT 8580 

CGGAATAAGT GTTGATGTAT TTTTGGCAGC CATAATACCA GCGATTTGAG CAGTGTTCAA 8640

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AATGCTTGAA TGAGCGACAG CAC3TTCTTTT TGTAATTTGT TTGTCTGATA CATCGACCAT 8760

TTTGGCGTGG CCTTGTTGAT TAATATGAGT AAACTCAGTC ATTTTACCCC TCCTAGTGCA 8820

TCTAGTATAT CATGAAAAAA TAAAAGTTTT GGAGATGATT TTTAATGGTA GTAGAAAAAA 8880

GAAACCCAAT CCCAGTTAAA GAAGCAATTC AACGTATCGT TAATCAGCAG AGTTCAATGC 8940

CGGCAATTAC GGTAGCACTT GAAAAAAGTC TAAATCATAT CTTAGCAGAA GATATTGTAG 9000

CTACTTATGA TATACCAAGG TTTGATAAAT CACCTTATGA TGGTTTTGCA ATTCGCAGTG 9060

TTGATTCACA AGGGGCAAGT GGTCAGAATC GCATTGAGTT TAAAGTGATT GATCATATTG 9120

GTGCA6GTTC AGTTTCTGAT AAATTAGTTG GGGATCACGA A6CGGTGCX3T ATTATGACTG 9180

GAGCACAAAT ACCTAATGGC GCAGATGCTG TTGTTATGTT TGAACAAACG ATTGAACTAG 9240

AAGATACATT TACAATTCX3T AAACCATTTT CAAAAAATGA AAATATATCT TTAAAAGGTG 9300

AAGAAACAAA GACAGGCGAT GTTGTTCTAA AAAAAGGACA AGTAATTAAT CCAGGGGCTA 9360

TCGCGGTCCT TGCAACATAT GGCTATGCAG AGGTTAAAGT TATTAAGCAA CCX3AGTGTCG 9420

CTGTTATTGC AACAGGAAGC GAATTATTAG ATGTTAATGA T6TATTAGAA GATGGGAAAA 9480

TTCGTAACTC TAATGGCCCA ATGATTCGTG CCTTAGCAGA AAAATTAGGT CTTGAAGTTG 9540

GTATTTACAA AACACAAAAA GATGATTTAG ATAGTGGCAT CCAAGTCGTT AAAGAAGCTA 9600

TGGAAAAACA TGATATCGTT ATTACAACGG GCGGAGTTTC TGTTGGAGAT TTTGACTATT 9660

TACCTGAGAT TTATAAGGCT GTAAAGGCGG AAGTGTTATT TAATAAAGTA GCAATGCGTC 9720

CTGGTAGCGT AACAACGGTT GCATTTGTAG ATGGaAAGTA TTTGTTTGGa TTATCTGGAA 9780

ATCCATCAGC TTGTTTTACA GGATTTGAAC TATTTGTGAA nCCAGCTGTT AAACATATGT 9840

GTGGCGCACT AGAAGTCTTC CCGCAAATAA TTAAAGCAAC ATTAATGGAA GATTTTACCA 9900

AGG»AACCC ATTCACACGA TTTATACGTG CTAAAGCAAC GTTAACAAGT GCTGGAGCTA 9960

CTGTAGTACC TTCAGGATTC AATAAATCAG GTGCGGTTGT AGCGATTGCA CATGCTAACT 10020

GTATGGTCAT GTTACCAGGA GGGTCACX5TG GTTTTAAAGC GGGGCATACA GTAGATATTA 10080

TATTGACTGA ATCTGACGCT GCTGAA6AGG AACTTCTTTT ATGATTTTAC AAATTGTAGG 10140

TTACAAAAAG TCTGGTAAGA CAACATTGAT GAGGCATATT GTCTCTTTCT TAAAGTCACA 10200

TGGTTATACA GTTGCTACTA TTAAACATCA TGGGCATGGT AAGGAAGATA TTCAATTACA 10260

GGATTCAGAC GTCGATCACA TGAAGCATTT TGAAGCGGGG GCAQATCAAA GTATTGTACA 10320

AGGTTTTCAA TATCAGCAAA CTGTAACACG TGTAGATAAT CAAAATCTTA CTCAAATTAT 10380

TGAAAAATCT GTTACAATTG ACACCAATAT CGTATTAGTT GAAGGCTTTA AAAATGCTGA 10440

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GAATGTTTGT TATAGCATTA ATGTAAGGGA GCATGAAGAT TTTACAGCAT TTGAGCAATG 10560

GTTATTAAAT AAAATTAAAA ATGATTGTGA TACACAATTA ACATAGAGGA TTGAAATGAA 10620

TGAAACAATT TGAAATCGTG ACAGAACCGA TACAAACAGA ACAATATCGT GAATTCACTA 10680

TAAATGAATA TCAAGGTGCA GTAGTTGTTT TTACCGGTCA TGTTCGCGAA TGGACTAAAG 10740

GCGTCAAAAC GGAATATTTA GAATATGAAG CGTATATTCC AATGGCTGAA AAGAAATTGG 10800

CACAAATTGG AGATGAAATA AATGAAAAAT GGCCTGGAAC GATAACGAGT ATTGTTCATA 10860

GAATAGGGCC ATTACAAATT TCAGATATCG CTGTATTAAT TGCGGTTTCT TCACCGCATC 10920

GTAAAGATGC CTATCGAGCA AATGAATATG CAATTGAGCG TATAAAAGAA ATTGTTCCGA 10980

TTTGGAAAAA AGAAATTTGG GAAGATGGTT CAAAATGGCA AGGGCATCAA AAAGGGAATT 11040

ATGAAGAAGC AAAGAGGGA6 GAATAAGAGA GATGAAGGTA CTTTACTTCG CAGAAATTAA 11100

AGATATATTA CAAAAAGCAC AGGAAGATAT TGTGCTTGAA CAAGCATTGA CTGTACAACA 11160

ATTTGAAGAT TTATTGTTTG AACGTTATCC GCAAATCAAT AATAAAAAGT TTCAAGTTGC 11220

TGTAAATGAG GAATTTGTAC AAAAATCX^GA TTTCATTCAA CCTAATGATA CTGTTGCATT 11280

AATTCCACCG GTTAGTGGAG GTTAAGGGAG CATGAAAGCA ATAATTCTTG CAGGTGGTCA 11340

TTCAGTGCGA TTTGGTAAGC CCAAAGCTTT TGCGGAAGTG AACGGT6AGA CCTTTTATAG 11400

TAGAGTAATT AAGACATTAG AATCAACAAA TATGTTCAAT GAAATTATTA TTAGTACAAA 11460

TGCGCAATTG GCAACGCAAT TTAAATATCC AAATGTTGTT ATAGATGATG AGAATCATAA 11520

TGATAAAGGT CCATTAGCAG GAATTTATAC AATCATGAAG CAACATCCTG AAGAAGAATT 11580

GTTTTTTGTC GTTTCTGTTG ATACACCAAT GATTACTGGT AAAGCTGTAA GCACGTTGTA 11640

TCAGTTTTTA GTTTCTCATC TTATTGAAAA TCATTTAGAT GTCGCAGCTT TTAAAGAAGA 11700

TGGACGTTTT ATTCCAACAA TTGCATTTTA TAGTCCGAAT GCATTAGGCG CTATAACTAA 11760

AGCACTACAT TCTGATAATT ACAGTTTTAA AAATGTATAT CATGAATTAT CAACGGATTA 11820

TTTGGATGTA AGGGATGTAG ATGCGCCCTC ATATTGGTAC AAAAATATAA ATTATCAGCA 11880

TGATTTGGAC GCTTTAATTC AAAAATTGTA AGCTGTTAGG AGGTCCACAA ATGGTAGAAC 11940

AAATAAAAGA TAAACTAGGA CGTCCCATCC GTGACTTAOG GTTATCTGTG ACAGATCGGT 12000

GTAACTTTAG GTGTGATTAT TGCATGCCTA AAGAGGTATT TGGAGATGAT TTCGTATTTT 12060

TACCTAAAAA TGAACTTTTA ACGTTTGATG AAATGGCTAG AATCGCTAAG GTATATGCAG 12120

AATTAGGTGT AAAAAAAATA CGCATTACAG GTGGAGAACC ATTGATGCGA CGGGATTTAG 12180

ATGTACTTAT AGCTAAATTA AATCAAATCG ATGGTATTGA AGATATTGGT TTGACTACAA 12240
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ATGTCAGTTT GGATGCTATT GATGATACGC TATTTCAATC AATCAATAAT CGTAATATTA 12360 

AAGCGACTAC GATTTTAGAA CAAATTGATT ACGCGACGTC TATTGGTTTG AATGTAAAAG 12420 

TAAATGTTGT TATACAAAAA GGTATTAACG ATGATCAAAT CATACCAATG CTTGAATATT 12480

TTAAAGATAA ACATATAGAG ATTCGATTTA TAGAATTTAT GGATGTTGGT AATGATAATG 12540

GATGGGATTT CAGTAAAGTT GTAACTAAAG ATGAAATGCT TACAATGATA GAGCAGCACT 12600 

TTGAAATCGA TCCTGTAGAA CXZAAAATATT TTGGGGAAGT AGCAAAATAT TATCGCCATA 12660 

AGGATAATGG TGTTCAATTT GGTTTGATTA CAAGTGTTTC ACAATCATTT TGTTCTACAT 12720 

GTACACGCGC AAGGCTGTCA TCAGATGGGA AGTTTTACGO ATGTTTATTT GCAACTGTCG 12780 

ATGGATTTAA CGTTAAAGCG TTTATTCGTT CTGGCGTGAC CGACGAAGAA TTAAAAGAAC 12840 

AATTTAAAGC TTTATGGCAA ATAAGAGATG ATCGATATTC AGATGAGAGA ACTGCTCAAA 12900 

CAGTTGCCAA TCGTCAACGT AAAAAGATAA ACATGAATTA TATTGGTGGT TAATGTGTAG 12960 

GGACCACTAC ATATTAAATC ATTAGAGATG TTTTAATATT TCTGTCTTAC TCCCTAAAAT 13020 

ACAATATTAT TTATTAAAGT AAAAACGGTC ATATCTATGC CAGATTTAAT AGAAATGATC 13080 

GTTTTTAAAG TTTTTACAAG TTGGCGGGGC CCCAACACAG AAGCTGACAG AAAGTCAGCT 13140

TACAATAATG TGCAAGTTGG CGGGGCCCCA ACATAGAGAA TTTCAAAAAG AAATTCTACA 13200 

GACAATGCAA GTTGGGGAAC GGGGCCCCAA CACAGAAGGT GACGAAAAGT CAGCATACAA 13260 

TAATGTGCAA GTTGGCGGGG CCCCAACATA GAGAATTTCA AAAGAAATTC TACAGACAAT 13320

GCAAGTTGGG GATCAACGAA ATAAATTTTA TGAGAATATC ATTTCTATCC CACTCTTAAG 13380 

AATCACTACA TAATAAATCT TTAGTGGTTC TTTAACATTG ATGTCACACT CCATGCCATT 13440 

GAGTTGTAAT ATATCTTTTT TAGGTATAAA TGTTGTCGAA TAAACAACAA GTTGTCCAAA 13500

AGATAXAAAT CTAAACAAGA TATAGCCAGC AATTTAATAT TTGTAATAGA TAAAATGCTA 13560 

AGTTTGATAT ATAATAAATT TAAGTAATTG TATAATAATA TGAATTACAA ACATCTAAGA 13620 

AGAAACATAG GAGGCATCAT ATTATGAGTA ATAAAGTTCA ACGTTTTATA GAAGCAGAAA 13680 

GGGAGTTAAG TCAGTTAAAG CACTGGTTAA AAACAACACA TAAGATTTCA ATTGAAGAAT 13740 

TTGTAGTCCT TTTTAAAGTG TATGAAGCTG AAAAGATTAG CGGTAAAGAA TTGAGGGATm 13800 

CATTACATTT TGAAATGCTA TGGGATACAA GTAAAATCGA TGTGATTATC CGTAAAaTCT 13860 

ATAAAAAAGA GCTTATTTCT AAATTGCGTT CTGAAACGGA TGAAAGACAA GTATTCTATT 13920 

TCTATAGTAC TTCTCAAAAG AAATTGTTAG ATAAAATTAC TAAAGAAATA GAAGTGTTAA 13980 

GCGTTACAAA CTAAAAACTT aAAAAgcaTG CCAATCTCTA TTCATCATAA TTGCGTCTTG 14040 
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GTTCATGGCA TTTCTAGTTA CATGACGTCC ATGAATTAAG AAGTAAACAA GCATAGTAAT 14160 

GATTGCTAAA GCGGCCATAA AGCCGAAGAT TTCACTATAT GAAAACATAT GAGTAAATAA 14220 

CCCAAGGAAT GATGGACCGA AGCCGACACC TGCATCTAGA CCAACGTAAA AAGTAGATGT 14280

CGCGATACCA TATTTAATCG GGGGTGAGAC TTTTATCGCA ATAGATTGCA TTGCAGATGA 14340 

TAAATTTCCA TACCCTAAAC CTAGGCAAGC ACCAGCAAGT AATATTAACC AGCTTTGATA 14400 

GCTTGAAATT AAGCATACAA ATGAAAGGAA AAGCATGATA AATGCTGGGT AGACAATAAT 14460 

ATTTTCATTT TTATCATCCA TCAATCTACC AGCAATAGGT CTAGTAATTA ACGATGCTAT 14520 

AGCATAGCAA ATAAAGAAAT AGCTTGCTGC AGTGACTAGG TGTCGCTCTA AAGCAAATGC 14580 

TTGTAAATAA GTTAGGATGG ACGCATAGGT AACGCCAATT AAAAGCATAA TTACAGCAAC 14640 

AGGAATGGCC TCTTTTGCAA TAAATTGATG AATACTAAAT CTTGGTTTAT CAATGACATT 14700 

AGTTTCAGTT TTGTTATTTG TTACTTCGAA ATCAACTTTT ATAAATAATG AGATAATGAG 14760 

TCCGAGTATG CCTAATATGA CACAAATAAT AAACAGTAAG TCAATTGCGT ATTTTGTAAT 14820 

AAGTAACATG CCTAGAAATG GGCCAATCGC TGTACCTAAT ACTAAACTTA AGGAAAATAA 14880 

ACTGATGCCT TCACTTTTTC TATTAACAGG GGTAACGTAT GCCGCAATAG TACCTGTTGC 14940

AGTTGTCACA ACTGCAGTTG CGATACCGTT TATGAGACGT ACAAAGATTA AAAAAGCTAA 15000 

AGATCCATCA ATAAAATAAA GTAATTGCGT GATAATTAAA GCAATTAAAC CAATAAATAA 15060 

TAATCGTTTA GGTCCrATTT SATTTACAAA TTTACCTGTA GCAAATCGA 15109

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9072 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 



GAGAGTCAAT GGCAAGAAGA ATATAAATAT TTGAGAGCGT TAATCTTTAA TGAAACAGAA 60

TTAGAGGAAG CGTATAAATG GATGCATCCT TGTTACACGT TGAATAATAA AAATGTAGTA 120

CTTATCCATG GCTTCAAAAA TTATGTTGCA CTATTATTTC ATAAAGGTGC CATTTTGGAG 180

GATAAATATC ATACACTCAT TCAACAGACT GAAAAGGTGC AAGCAGCTCG TCAGTTACGA 240

TTTGAAAATT TAACAGAGAT TCAAGCACGT ACCGAAGAAA TTAAATATTA TCTAGCCGAA 300

GCAATTAAAG CTGAAAAAGC TGGTAAAAAA 6TTGAAATGA AGAAAACAGA GGAATATGTT 360

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AAATTAACGC CAGGCAGACA ACATCAATAT ATATATCATA TTGGACAAGC TAAACGCAgfT 480

GgAACAAGAC AAAAGCGTGT TGAAAAGTAT ATTAACCAAA TACTAGAAGG TAAAGGGATG 540 

CATGATAAGT AATTAATGAG TAAAGCATAC CGGTTATACA ACAACATACA AGATGACACG 600

AAACAACCAA TGGCTCATGC TGTTGGTTGT TTTTTTAGGT GTGTCTGTCA TGGGGAACAC 660 

TTTGACGTTG GAATTCCGTT ACAGGCTTGG GAGTAGAAAA TGTTAGCAAA AGGCAAGGGT 720 

GTCTACAATG AATGATGAAG ATATTAAAAT ATAAGGATGA CTTTGTGAGT GGCGGATGGG 780 

CGGTTGTCCG TCTGTAACAA TGGATGCGTG TGCATTATTA CAAAAATTCG ACTTTTGTAA 840 

TAATATTTCA CATTTTCGAC ACTTTTTTGC TATAAAACAA CCAATTGAGC GATAATAAAT 900 

TCGCTTTTAA AAAATATGAG TTATCTATTT AGTTGCCAAA GATAAAATAA TAATGTTTAA 960 

TAACATCATA TAGAGTATGT TAGTTTTAAA TGTCGAATAT AOGAATGTGc AAACAAAGTA 1020 

ATCGGTAGAA ATTCAACATA CATAGC6CCG TTTACTGTTA AGTATTCACA TTACAGATGA 1080 

AAAATATAAA ATTCTACATA ATCAAGACCA TGATGTGTAC TTGTTTAACT TATGACTCTA 1140 

TTTGTTTAAC AATTGCGATA ATGGTCTTTT TATTTTATGC GTATCATTCG TCATATTTTT 1200 

TATGAGGAAG GAGAAATGAT TATGTTAAGT ATTAAGCATT TAACGAAAAT TTATTCTGGT 1260

AATAAAAAGG CAGTAGATGA CATCTCTTTA GATATTCAAT CTGGGGAATT TATCGCATTT 1320 

ATTGGAACCA GTGGAAGTGG CAAAACGACT GCTTTAAGAA TGATAAACCG TATGATTGAA 1380 

GCGACAGAAG GACAAATTGA AATTGATGGT AAAGATGTTC GGAGTATGAA TCCTGTCGAA 1440

TT6CGTAGAA ATATTGGCTA TGTTATTCAA CAAATTGGCT TAATGCCTCA TATGACGATT 1500 

AAAGAGAATA TTGTGTTGGT ACCCAAATTG TTGAAATGGA CTAAAGAGGA AAAGGATAAA 1560 

CGTGCAAAGG AATTAATTAA ACTTGTGGAT TTACCGGAGT CATTTTTAGA GCGTTATCCA 1620

GCAC^CTAT CAGGTGGGCA ACAACAACGT ATCGGTGTTG TAAGAGCACT TGCGGCCGAA 1680 

CAAGATATTA TTTrAATGGA TGAACCTTTT GGTGCATTGG ATCCTATTAC GAGAGATACG 1740 

TTACAAGATT TAGTTAAAAC GTTACAACGA AAATTAGGCA AGACGTTTAT CTTTGTAACA 1800 

CATGATATGG ATGAAGCGAT TAAATTAGCA GACAAAATTT GTATTAT6TC AGAAGGTAAG 1860 

GTGGTGCAAT TTGATACGCC AGACAATATT TTAAGACATC CCGCAAATGA TTTTGTACGT 1920 

GATTTTATAG GACAAAATAG ACTGATTCAA GACCGTCCCA ATGACAAGAC TGTAGAAGGT 1980 

GTAATGATTA AACCAATCAC GATACAAGCA GAAGCAACAC TGAATGACGC CGTTCATATT 2040 

ATGAGACAAA AACGTGTTGA TACTATTTTT GTAGTAGATA 6TAATAACCA TTTACTAGGT 2100 

TTCTTAGACA TTGAAGATAT AAATCAGGGT ATACGTGGAC ACAAAAGTTT ACGAGACACC 2160 

ATTTTAAAAA GAAACX3TTAG GAATGTACCT GTCGTAGATG ATCAACAGCG TTTAGTAGGA 2280 

CTGATTACGC GTGCCAATGT TGTTGATATT GTATATGACA CGATTTGGGG CGATAGTGAG 2340 

GATACAGTGC AAACAGAACA TGTGGGGGAA GACAcTGCGT CCTCAAAAGT GCATGAGCAA 2400 

CACACTACTA ATGTCAAAGT ACGTGACATA GGAGATGATA AATCATGATT GAGTTCCTAC 2460 

ATGAACATGG TGGACAGTTG ATGTCGAAAA CACTGGAACA TTTCTATATT TCTATAGTGG 2520 

CATTATTACT TGCCATCATT GTTGCAGTAC CTATAGGCAT TTTATTATCA AAAACAAAGC 2580 

GAACTGCCAA TATTGTATTA ACTGTGGCAG GTGTCTTACA AACTATTCCA ACACTAGCTG 2640 

TACTTGCTAT TATGATACCG ATTTTTGGTG TTGGTAAAAC GCCTGCAATT GTAGCGCTAT 2700 

TTATTTATGT ATTATTACCT ATTTTAAATA ACACGGTACT C3GGTGTTCAA AATATTGATA 2760 

GCAACATTAA AGAAGCTGGA AAAAGTATGG GAATGACACA ATTTCAATTG ATGAAGGATG 2820 

TTGAATTGCC GTTAGCATTG CCGCTTATCA TTGGTOGCAT TCGTTTGTCA TCTGTGTATG 2880 

TAATTAGTTG GGCTACACTT GCAAGTTATG TAGGTGCGGG TGGATTAGGT GATTTCATTT 2940 

TCAATGGTTT AAATTTATAT GATCCACTGA TGATTGTAAC TGCAACGGTA CTCGTTACTG 3000 

CACTAGCATT AGGT6TTGAT GCCTTATTAG CTTTAQTTGA AAAATGGGTA GTTCCCAAAG 3060 

GCTTAAAAGT ATCTGGATAA TTAGGAGGCT AAGATAATGA AGAAAATTAA ATATATACTT 3120 

GTCGTGTTTG TCTTATCX5CT TACCGTATTA TCTGGATGTA GTTTGCCCGG ACTAGGTAGT 3180 

AAGAGCACGA AAAATGATGT CAAAATTACA GCATTATCAA CAAGCGAATC GCAAATTATT 3240 

TCACATATGT TACXX3TTGTT AATAGAGCAT GATACACACG GTAAGATAAA GCCAACATTA 3300 

GTAAATAATT TAGGGTCAAG TACGATTCAA CATAATGCCT TAATTAATGG GGATGCTAAT 3360 

ATATCAGGTG TTAGATATAA TGGCACAGAT TTAACGGGAG CTTTGAAGGA AGCACCAATT 3420 

AAAAATCCTA AGAAAGCAAT GATAGCAACA CAACAAGGAT TTAAAAAGAA ATTTGATCAA 3480 

ACGtTTTTTG ATTCGTATGG TTTTGCGAAT ACGTATGCAT TCATGGTAAC GAAGGAAACC 3540 

GCTAAAAAAT ATCATTTAGA GACAGTTTCA GATTTAGCAA AGCATAGTAA AGATTTACGT 3600 

TTAGGTATGG ATAGTTCATG GATGAATCGT AAAGGCXaATG GCTATGAAGG ATTTAAAAAA 3660 

GAGTATGGTT TTGACTTTGG TACAGTGAGA CCAATGCAAA TAGGTCTAGT CTACGACGCA 3720 

TTAAACTCAG AGAAGTTAGA CGTTGCATTA GGTTATTCTA CAGATGGTCG AATTGCGGCG 3780 

TATGATTTGA AAGTACTTAA AQATGATAAA CAATTTTTCC CACCTTATGC TGCGAGTGCT 3840 

GTTGCAACAA ATGAATTATT ACGGCAACAC CCAGAACTTA AAACGACGAT TAATAAGTTG 3900 

ACAGGAAAGA TTTCGACTTC AGAGATGCAA CGCTTGAATT ATGAAGCX3GA TGGTAAAGGT 3960 

AAAGGTGGTC ATAAGTAATG GAAGGTAATT TATTACAGCA ATTATTCAAT TATTATGTTA 4080 

CGAACTTTGG TTATCTATGG GATTTATTTT TCAAACACTT ATTAATGTCT GTCTATGGTG 4140 

TGCTGTTTGC AgCTTTAATT GGTATTCCAT TGGGAATCTT GCTTGCaAGA TACACAAAAC 4200 

TTTCTGGATT TGTAATTACA ATTGCAAATA TAATTCAAAC AGTTCCAGTC ATTGCAATGT 4260 

TAGCTATTTT AATGTTAGTC ATGGGCTTAG GTTCAGAAAC AGTAGTTTTA ACAGTGTTTT 4320 


TATATGCGTT ACTTCCAATT ATAAAAAACA CTTATACTGG TATAGCTAGT GTTGATGCGA 4380 

ATATTAAGGA TGCTGGCAAA GGTATGGGAA TGACACGCAA TCAAGTGCTA CGAATGATTG 4440 

AATTACCGTT ATCTGTTTCG GTTATTATCG GTGGCATTCG TATTGCCTTG GTTGTTGCX3A 4500 

TAGGTGTTGT TGCCGTTGGA TCATTTATAG GAGCACCTAC GCTTGGTGAC ATTGTGATTC 4560 

GTGGTACAAA TGC6ACGGAT GGCACAACGT ITATTTTAGC AGGTGCX3ATT CCGATTGCTA 4620 

TCATTGCAAT CGTCATTGAT GTACTATTAA GATTTTTAGA AAAAC3GATTA GACCCAACAA 4680 

CACGACATCG TAAAAATCAA TCTAATCATC GGCCGCAAAG TATTAATATG TAATAGTAGA 4740 

AGATGTTTAT AATTTAGCGA TTTCGTTTCA TGATTTATAA AAAATGAOOC TACTCAAGGA 4800 

GCTCAAATAA TCTTTGA6TA GCCTTTTTAT AGGTTGTGTT TGTATGCGTT TACACTAAAA 4860 

TAGCAATTAT TATCATGAAA GTTTTTGGAT AAAAAGCGTT AATTATTGTA AAAATACTAA 4920 

AAAATGAGAT GTTTTATTTA TAATTTTCTG CAAATTTATG ATATTGTTTC TTAATATATC 4980 

ATATTAAAAA TTTGTTTTTC TTAAACATAG GAGGCTTATC TAATTCATGG ACACATCAAA 5040 

ACAATTTAGA GGTGACAACC GATTGCTTTT GGGTATCGTT TTAGGGGTTA TTACCTTTTG 5100 

GCTATTCGCG CAGTCACTTG TTAATCTTGT TGTCCCATTA CAATCAACAT ATAGTAGTGA 5160 


CGTTGGAACG ATAAATATCG CTGTTAGCTT ATCTGCCTTA TTTGCTGGTT TGTTTATCGT 5220 

AGGTGCTGGT GATGTTGCTG ATAAATTTGG TCGCGTCAAA ATTACTTATG TAGGATTGAT 5280 

ATTAAATGTT GTAGGTTCAT TACTCATCAT CATTACACCT TTGCCAGCAT TTTTAATTAT 5340 


AGGTAGAATA ATTCAAGGTT TGTCTGCAGC ATGTATTATG CCATCAACAC TTGCTATTAT 5400 

TAACGAATAT TATATTGGTA CAAGAAGACA ACGTGCCTTA AGCTATTGGT CTATTGGTTC 5460 

TTGGGGTGGT AGTGGTATTT GTAC6TTGTT TGGTGGCTTA ATGGCTACAT ATATAGGTTG 5520 


GCGTTCAATA TTTGTTGTTT CAATTCTATT AACATTATTA GCAATGTACT TAATCAAACA 5580 

TOCACCTGAG ACTAAAGCAG AACCAATCAA AGGTATGAAA GCAGAAGCTA AAAAGTTTGA 5640 

CGTTATTGGT TTAGTCATTT TAGTAGTGAC GATGTTAAGT TTAAATGTAA TCATCACACA 5700 

GACGTCTCAT TTTGGTTTAG TTTCACCXSTT AATTCTAGGT TTAATTGTTG TCTTTATCTG 5760 
AATTTTTAAA AATAGAGGAT ACAGTGGTGC AACTATTTCA AACTTCTTAT TAAATGGTGT 5880 

AGCAGGTGGT GCACTTATCG TTATTAACAC GTATTATCAA CAACAATTAG GATTTAATTC 5940 

TTCGCAAACG GGTTATATTT CATTAACGTA TTTAATAACA GTGTTGTCAA TGATTCGTGT 6000 

AGGTGAAAAG ATTTTATCTC AACATGGTCC GAAGCGCCCA CTATTACTAG GAAGTGGCTT 6060 

TACAGTGATT GGGTTAATCT TATTGTCGTT AACATTTTTA CCAGAAGTGT GGTATATCAT 6120 

ATCTAGTATA GTTGGATATT TATTGTTTGG TACTGGTTTA GGATTATATG CTACACCATC 6180 

AACTGATACA GCAGTTGCTA GTGCGCCAGA TGATAAGTCX3 GGTGTTGCTT CAGGTGTGTA 6240 

TAAAATGGCG TCATCATTAG GAAATGCATT TGGAGTAGCA GTATCTGGTA CX5GTTTATAC 6300 


TGTGTTAGCA GCTAATTTAA ATTTGAACTT AGGTGGTTTC ACAGGTATGA T6TTTAATGC 6360 

CTTGCTAGCA ATTGTTGCAT TTTTAGTCAT TTTACTATTA CTTCXTTAAAA ATCAAACX3AA 6420 

TTTGTAAAAC TGAAATGAAA GCAAGTTATT ATGTAGGGAT TTTAAAGGAA ATTTTGTGAA 6480 


AGTAAGTTTA TCATACACAC TTAATGTTGC GTATTGACGT TTAATOTTAG GTGTGTTCTT 6540 

TTATAGACGA TAAAAGCTGT GTOCATATTA AGCX3AATGAT TTTCAAATTG ACGCTAATAT 6600 

GCGAAAGTAG TATTTTTAAA ATGAACAACA ACGATGAAGA GGGGTTTATA GGATGAAAAT 6660 

TGCAATTGCT GGATCGGGTG CATTAGGTAG TGGCTTTGGT GCCAAACTAT TTO^GCAGG 6720 

ATATGAT6TC ACACTTATTG ACGGATATAC ATCTCATGTT GAAGCGGTTA AGCAACATGG 6780 

ATTAAATATA ACGATTAATG GAGAGGCATT CGAGTTAAAC ATTCCGATGT ATCATTTTAA 6840 

TGATCAACCG GACGAAAGCA TTTACGATGT TGTCTTTCTA TTTCCAAAGT CTATGCAATT 6900 

AAAAGAAGTG ATGGAAGATA TGAAGCCACA TATTGATAAT GAAACGATCG TCGTATGTAC 6960 

GATGAATGGT CTGAAGCATG AAGAAGTCAT TGCGCAGTAT GTTGCTCAAT CACAAATTGT 7020 

CAGAGGTGTT ACGACTTGGA CGGCAGGTCT TGAAAGCCCT GGACACAGTC ATTTACTTGG 7080 

TAGTCGACCA GTTGAAATAG GTGAACTAGT GGATGAAGGT AAAGAAAATG TTATAAAAGT 7140 

TGCTGATTTA CTTAACGAAG CGGAATTGAA TGGTGTCATT AGTAAAGATT TATACCAATC 7200 

GATTTGGAAA AAGATTTGTG TTAATGGTAC GGCAAATGCA TTAAGCACAG TGTTGGAGTG 7260 

TAATATGGCA TCGCTGAATG AAAGTAGTTA TGCGAAGTGT TTGATTTATA AATTAACGCA 7320 


AGAAATAGTG CATGTAGCGA CGATTGATAA TGTTCATTTA AATGTTGATG AAGTATTTGA 7380 

ATATTTAGTT GATTTAAATG AAaAAGTTGG TGCGCATTAT CCATCCATGT ATCAAGATTT 7440 

AATTGTTAAT AATAGAAAAA CTGAAATTGA TTATATTAAT GGCGCAGTTG CAACATTAGG 7500 


TAAACAACGT CaTATTGAAG CGCCAGTCAA TCGCTTTATT ACTGATTTAA TTCATACTAA 7560 




CAATCACGTG ATATTACGGT CATTATTAAG ATTGAAATGT AATAAATAAA GAACAGCAGT 7680 

AAGGTACTTT CAAATTGAAA TGATCTTGGT GCTGTTTTTC TTGATTGATC TTCGTCATAA 7740 

TTCAGATTTG TCATAGGcTA CGACATACTA TTAGTATTTA CTAGACAGTT TTTACGACGA 7800 

CACTTTGAAA AATTTTGAGG CAAATCATTT GGAAGTCTCA CGTGAATTTT GTAAACTCAT 7860 

CAAGCAAGTA ATTATATTAA AAAGACAAAT AGAGAAAAGG TGTTTATAAT GAGTAAAATT 7920 


TTTGTAACTG GTGCAACGGG CCTTATTGGC ATTAAATTAG TTCAAAGACT AAAAGAAGAG 7980 

GGGCATGAGG TTGCTGGTTT TACTACATCT GAGAATGGTC AACAAAAGCT AGCTGCTGTT 8040 

AATGTAAAAG CATATATTGG TGATATATTA AAAGCTGATA CTATTGATCA AGCGTTAGCA 8100 

GATTTTAAAC CAGAAATCAT TATCAATCAA ATTACGGATT TAAAAAATGT TGATATGGCA 8160 

GCAAATACGA AAGTACGTAT TGAAGGTTCT AAAAACCTAA TTGATGCXXSC GAAAAAGCAT 8220 

GACGTTAAGA AAGTAATTGC CCAAAGTATT GCCTTTATGT ATGAACCTGG CX3AAGGATTA 8280 

GCAAATCAGG AAACTTCACT TGATTTTAAC TCAACTGGCG ATAGAAAAGT AACGGTTGAT 8340 

GGTGTGGTTG GTTTAGAAGA AGAAACGGCT CGTATGGATG AATACGTTGT TTTACGTTTT 8400 

GGCTGGTTAT ATGGCCCAGG TACTTGGTAC GGAAAAGATG GCATGATTTA TAATCAATTT 8460 

ATGGATGGTC AAGTGACACT TTCAGATGGC GTAACATCAT TTGTGCATCT TGATGATGCA 8520 

GTTGAAACAT CTATTCAAGC TATTCATTTT GAAAATGGTA TCTATAATGT AGCAGATGAT 8580 

GCACCTGTTA AAGGTTCTGA ATTTGCAGAA TGGTATAAAG AACAACTTGG TGTTGAACCA 8640 

AATATTGATA TTCAACCTGC GCAACCATTT GAACGTGGCG TAAGCAATGA GAAGTTTAAA 8700 

GCGCAAGGTG GTACTCTGAT TTATCAAACT TGGAAAQATG GCATGAATCC AATTAAATAA 8760 


TAATTTATCC GTTTAATATA CAAAGAATAA AGACTTGGTC GAATCGTGGA TGATATATTA 8820 

TCAAACGCAC GGCTCGAACA AGTCTTTTTT ATTATGTCTT CGTTATCTTT GTATGAAGGA 8880 

ATAACAGAAT TACAATTAAT GTACTGAATA ATGCAATTAA TGTTGTGATT AGTGCTAATT 8940 


TAATTTCTAT TGGTAGCCAA GTCAGTACAA AAGACCAATT ATTGCTACCG AGAATGAGAT 9000 

ATGGTAATGC ATATAATATG AGCGCTAAAG CGATACATAT ACATAATGAT AACCAACTCA 9060 

ATACAGCAAT CC 9072 


(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16826 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

GTGGAACAGC TGTAACTATA TCATTTCTTT CAACATTTAT TGGGAAAATG TTAGCTACAT 60 

TTCTATATCC GATTAATAAT GTAGTACTTT CATATATnTC TGTAAATGAA AGTGACAATA 120 

TAAAGAAGCA ATATTTGaAA ACTAATCTAA TTGCTATAGC TGCCCTATGT TTAGTCATGA 180 

TTATATGTTA TCCAATTACA ATAATTATTG TCTCTTTACT GTATAACATT GATTCAAGTT 240 

TATATTCGAA GTTTATTATT TTAGGTAATA TAGGTGTTTT ATTCAATGCA GTGAGTATTA 300 

TGATCCAAAC TTTAAATACA AAACACGCAT CAATAACATT ACAAGCGAAT TATATGACGC 360 

TTCACACGAT TACATTTATA TTCATAACTA TTTTAATGAC AATTGCGTTT GGTCTAAATG 420 

GATTCTTTTG GACAACGCTG TTCAGCAACA TTATTAAGTA TGTGATXTTA AATATTATAG 480 

GTrrAAAGTC TAAATTCATT AATAAAAAGG ACGTCGATTA GATGAGTGAA AAAAAGATTT 540 

TGATTTTATG TCAGTATTTT TATCCGGAAT ATGTATCTTC TGCGACGTTA CCAACTCAAT 600 

TGGCGGAAGA TTTAATTGCG AATCACATTA ATGTCGATGT CATGT6TGGA TGGCCATATC 660 

AATATAGTAA TCATAAACAG GTTTCTAAAA CCGAGATGCA TCGTGGTATT CGCATTCGAC 720 

GTCTCAAGTA TTCGAGGTTT AATAACAAAA GTAAGGTTGG AAGGATCATC AATTTCTTTA 780 

GTTTATTTTC AAAATTCGTG ATTAATATAC CTAAAATGTT GAAATATCAT CAGATTCTTG 840 

TTTACTCTAA TCCACCAATC TTGCCATTAA TACCAGACGT TTTACACAGA CTGCTTAAGA 900 

AAAAATATTC TTTTGTGGTG TATGATATAG CACCTGATAA TGCGATTAAG ACAGGTGCAA 960 

CTCGTCCAGG TAGCATGATT GATAAGCTGA TGCGTTACAT TAATAGACAT GTCTACAAGA 1020 

ATGCTGAAAA TGTCATTGTC CTTGGTACGG AAATGAAAAA CTACTTACTA AATCATCAAA 1080 

TTTCTAAAAA TGCTGACAAT ATCCATGTGA TTCCTAACTG GTATGACATG CGTCAATTAC 1140 

AAG^fflMTCG TATCTATAAT GACACATTTA AAGCTTACCG TGAGCAATAC GACAAAATTT 1200 

TATTGTATAG CGGTAATATG GGGCAGTTAC AGGATATGGA GACACTTATC TCATTTTTAA 1260 

AATTAAATAA GGATCAGTCT CAAACGTTAA CAATACTTTG TGGTCATGGT AAGAAATTTG 1320 

CAGATGTCAA AACGGCAATA GaAGACCATC GTATTGAAAA TGTTAAAATO TTTGAGTTTT 1380 

TAACAGGTAC AGACTATGCT GACGTATTAA AAATTGCGGA TGTATGTATT GCATCGCTGA 1440 

TTAAAGAAGG CGTCGGTTTA GGCGTGCCGA GCAAGAATTA TGGCTATCTT GCAGCTAAGA 1500 

AAGCGTTGGT ACTCATCATG GATAAGCAAT CTGATATCGT TCAACATGTT GAACAATATG 1560 

ATGCGGGTAT CCAAATTGAT AATGGCGATG CACATGCCAT TTATAACTTC ATCAACACTC 1620 

ACTCGAGTAA GGAATTGCAC GAGATGGGTG AGCGCGCACA TCAACTGTTT AAAGATAAAT 1680 

AAGCGATTAT TCGATGTAGT GAGTTCAATA TATGGTTTAG TAGTTTTAAG TCCGATTCTG 1800 

TTAATTACAG CATTACTAAT TAAAATGGAa TCACCTGGAC CAGCCATTTT CAAACAAAAA 1860 

AGACCX3ACGA TTAATAATGA ATTGTTTAAT ATTTATAAGT TTAGATCAAT GAAAATAGAC 1920 

ACACCTAATG TTGCAACTGA TTTAATGGAT TCAACATCGT ATATAACAAA GACAGGGAAG 1980 

GTCATTCGTA AGACCTCTAT TGATGAATTG CCACAATTAT TGAATGTTTT AAAAGGAGAA 2040 

ATGTCAATTG TAGGTCCTAG ACCAGCGCTT TATAATCAAT ACGAATTAAT CGAAAAACGT 2100 

ACAAAAGCGA ACGTGCATAC GATTAGACCA GGTGTGACAG GACTAGCTCA AGTGATGGGG 2160 

AGAGATGATA TCACTGATGA TCAAAAAGTA GCGTATGATC ATTATTACTT AACACATCAA 2220 

TCTATGATGC TTGATATGTA TATCATATAT AAAACAATTA AAAATATCGT TACTTCAGAA 2280 

GGTGTGCATC ACTAATGAGA AAAAATATTT TAATTACAGG CXSTACATGGA TATATCGGTA 2340 

ATGCTTTAAA AGATAAGCTT ATTGAACAAG GACATCAAGT AGATCAAATT AATGTTAGGA 2400 

ATCAATTATG GAAGTCGACC TCGTTCAAAG ATTATGATGT TTTAATTCAT ACAGCAGCTT 2460 

TGGTTCACAA CAATTCACCT CAAGCAAGGC TATCTGATTA TATGCAAGTG AATATGTTGC 2520 

TGACqAAACA ATTGGCACAA AAGGCTAAAG CTGAAGACGT TAAACAATTT AnTTTATGA 2580 

GTACTATGGC AGTTTATGGA AAAGAAGGTC ATGTTGGTAA ATCAGATCAA GTTGATACAC 2640 

AAACACCAAT GAACCCTACX5 ACCAACTATG GTATTTCCAA AAAGTTCGCT 6AACAAGCAT 2700 

TACAAGAATT GATTAGTGAT TCGTTTAAAG TAGCAATTGT GAGACCACCA ATGATTTATG 2760 

GTGCACATTG CCCAGGAAAT TTCCAACGGT TAATGCAATT GTCAAAGCGA TTGCCAATCA 2820 

TTCCCAATAT TAACAATCAG CGCAGTGCAT TATATATTAA ACATCTGACA GCATTTATTG 2880 

ATCAATTAAT ATCATTAGAA GTGACAGGTG TGTACCATCC TCAAGATAGT TTTTACTTTG 2940 

ATACATCGTC AGTAATGTAT GAAATACGTC GCCAATCACA TCGTAAAACG GTATTGATCA 3000 

ACAf GCCTTC AATGCTAAAT AAGTATTTTA ATAAGTTGTC GGTCTTTAGA AAATTATTCG 3060 

GCAATTTAAT ATACAGCAAT ACGTTATATG AAAATAATAA TGCACTTGAA ATTATTCCTG 3120 

GAAAAATGTC ACTTGTTATT GCGGACATCA TGGATGAAAC GACAACCAAA GATAAGGCAT 3180 

AAGTCATCTA TTAAATAAAA TCAACATACA AATCGTTTTA TTTGGAGGTT ATAGTATGAA 3240 

GTTAACAGTA GTTGGCTTAG GTTATATTGG TTTACCAACA TCAATTATGT TTGCAAAACA 3300 

TGGcGTCGAT GTGCTTGGTG TTGATATTAA TCAGCAAACG ATTGATAAGT TACAAAGTGG 3360 

TCAAATTAGT ATTGAAGAAC CTGGATTACA AGAGGTTTAT GAAGAGGTAC TGTCATCGGG 3420 

AAAATTGAAG 6TATCTACAA CGCCAGATGC ATCTGATGTT TTTATCATTG CCGTTCCGAC 3480 

TAGTATTTTA TCATTTTTAG AAAAAGGAAA TACCATTATT GTAGAGTCGA CAATTGCGCC 3600 

TAAAACGATG GATGATTTTG TAAAACCAGT CATTGAAAAT TTAGGGTTTA CAATAGGTGA 3660 

AGATATTTAT TTAGTGCATT GTCCAGAACG TGTACTGCCA GGAAAAATTT TAGAAGAATT 3720 

AGTTCATAAC AATCGTATCA TTGGCGGTGT GACTGAAGCT TGTATTGAAG CGGGTAAACG 3780 

TGTCTATCGC ACATTCGTTC AGGGAGAAAT GATTGAAACA GATGCACGTA CTGCTGAAAT 3840 

GAGTAAGCTA ATGGAAAACA CATATAGAGA OGTGAACATT GCTTTAGCTA ATGAATTAAC 3900 

AAAAATTTGC AATAACTTAA ATATTAATGT ATTAGATGTG ATTGAAATCG CAAACAAACA 3960 

TCCGCGTGTT AACATCCATC AGCCTGGTCC AGGTGTAGGC G6TCATTGTT TAGCTGTTGA 4020 

TCCGTACTTT ATTATTGCTA AAGACCCTGA AAATGCAAAG TTAATTCAAA CTGGACGTGA 4080 

AATTAATAAT TCAATGCCGG CCTATGTTGT TGATACAACG AAQCAAATCA TCAAAGTCTT 4140 

GAGCGGGAAT AAAGTCACAG TATTTGGTTT AACTTATAAA GGTGATGTTG ATGATATAAG 4200 

AGAATCACCA GCATTTGATA TTTATGAGCT ATTAAATCAA GAACCAGACA TAGAAGTATG 4260 

TGCTTATGAT CCACATGTTG AATTAGATTT TGTGGAACAT GATATGTCAC ATGCTGTCAA 4320 

AGACGCATCG CTA6TATTGA TTTTAAGTGA CCACTCAGAA TTTAAAAATT TATCGGACAG 4380 

TCATTTTGAT AAAATGAAGC ATAAAGTGAT TTTTGATACA AAAAATGTTG TGAAATCATC 4440 

ATTTGAAGAT GTATCGTATT ATAATTATGG CAATATATTT AATTTTATCG ACAAATAAAA 4500 

TGTGTCAAAC TAGGGCATAC ATQATTAAGG AAAGATAAGC TGTCATGTGT TTGAACTTCA 4560 

GAGAGGATAA TGTTATGAAA AAAATTATGG TTATTTTCGG TACGAGACCC GAAGCAATAA 4620 

AAATGGCACC ATTAGTAAAA GAAATTGATC ATAATGGGAA CTTTGAAGCG AACATTGTGA 4680 

TTACAGCACA ACATAGAGAT ATGTTAGATA GTGTGTTAAG TATATTTGAT ATTCAAGCTG 4740 

ATCM^TTT AAATATTATG CAAGATCAAC AAACATTAGC AGGCCTTACG GCX3AATGCAC 4800 

TTGCTAAACT TGATAGCATC ATTAATGAGG AACAACCGGA TATGATTTTA GTACATGGTG 4860 

ATACTACAAC GACTTTTGTA GGAAGTTTGG CAGCATTTTA TCATCAAATT CCGGTCGGAC 4920 

ATGTAGAAGC TGGACTTCGA ACACATCAGA AATACTCACC ATTTCCTGAA GAGTTAAATC 4980 

GAGTCATGGT AAGTAATATT GCTGAATTGA ATTTTGCGCC AACAGTAATT GCAGCTAAAA 5040 

ATTTACTTTT TGAAAACAAA GACAAAGAGC GTATCTTTAT TACTGGAAAT ACAGTTATTG 5100 

ACOCATTGTC AACAACAGTT CAAAATGATT TTGTTTCAAC GATTATTAAT AAACATAAAG 5160 

GCAAGAAAGT TGTTTTACTA ACAGC6CATC GTCGTGAAAA TATTGGGGAA CCGATGCATC 5220 

AGATTTTTAA AGCAGTAAGA GATTTGGCAG ATGAATATAA AGATGTTGTC TTCATTTATC 5280 
GGATTGAATT AATTGAGCCA TTAGATGCGA TTGAGTTCCA TAATTTTACA AATCAATCXST 5400 

ACCTCXTTGCT GACAGATTCT GGTGGTATTC AAGAGGAGGC TCCTACATTT GGAAAACCTG 5460 

TGTTGGTATT AAGGAATCAT ACAGAGCGTC CCGAAGGCGT TGAGGCGGGA ACATCX5AGAG 5520 

TAATTGGCAC AGATTATGAC AATATTGTTC GAAATGTGAA ACAATTGATT GAGGATGATG 5580 

AAGCGTATCA ACGTATGAGT CAAGCGAATA ATCCATATGG TGATGGACAA GCATCACGAC 5640 

GTATTTGTGA AGCAATAGAA TATTATTTTG GATTGCGCAC AGACAAGCCG GATGAATTCG 5700 

TACCTTTACG TCACAAATAA TAAAAAACCC CTAATCATGA AGTTGGTTTA GACAACCAGC 5760 

GGTGACTAGG GGTTTTTAAT ATATTTATTT TTGATAGTGG TAGCCAATAT CATATTTGAA 5820 

TACTTTATTT GATAATATTG GACTTTGCTG TCCATCGTCA TCACTTTTTA AACGTACATT 5880 

TTTATGAGCT TCTTTAAATA CATCGGAATT CAACCAATTA TTAAAGCTAT CTTCAGATTC 5940 

CCAAATAGTT AAGATTTTAA CTTCGTCTGT ATCCTCGGTA TTTAATGTTT TAGTGACAAA 6000 

CATTTGTTGG AAGCCTTCAA TAGTTTCAAT ACCTTGTCTA TTGTAAAAAC GTTCAATOGT 6060 

TTCTTCCGCA CTGCCTTTTT GTAATTGTAA TCTATTTTCT GCCATAAACA TGGGCAATCA 6120 

CTCCTCTATT TTATGATTTG ATTTGGGTAA TGTTTTTACA AATGTAAAGA GTACAGCGGT 6180 

TTGTATGATA ACCATTATGA TTAATCCTAC ACGGACTGCA AGAACATCCA CCATATAAAT 6240 

TGAAAAACCT ATTACAATGT ATAAGCTAAT TAAAATTTTA ATTTTCTGTT GTAGCGTGTA 6300 

GCCTCGATGT AAATAAAAGT TTTCTACATA TTCTTTATAA ATTTTTTGAT TAATAAGCCA 6360 

ATTGTAAAAG CGATCTGAAC TTCGAGCAAA GCAAAAAACT GCTACGAGTA AAAAAGGGGT 6420 

CGTTGGCAGT AAAGGTAATA CGGCACCTGC AATACCAAGC GCTGTAAATA TTAAGCCAAT 6480 

GACGATTAAA ATAAGTCGCA TTGAAAAAAC TCCATTCTAG TACTAATGCG CATGTAATAT 6540 

TGTTTTAGTA ATATAACTCA TGCTAAATAT AATGTGTATG ATAAGTGCAA TGACTCAGTA 6600 

AAATGAAACG ATGTTGAATT ATCCTTGTCA CATTAACGCA TTTTAAGCGC GACTTTCATA 6660 

ACAACCAAAC TATTTAATGA GAATTATTCT CAAGTATTAT AGTTATATTA TGTGTTTTAT 6720 

TTTTGAAAAG TGCAAXATGT TTTCGAAAAT AAGATTATTT TTATGTGCAA AAACGACGCA 6780 

AAAGTTTTAA AAATGAGACT TCTGTGAGCT GATTATTTTA TAAAATGTAA ACXSCTTACTA 6840 

TATAATGTGA ATCATATCGT TTAAAAGCAT TATTAAATAT GATGCTAAGA GATTTATATT 6900 

ATAGCCAATA AACAAAGGAG AGA7AATATG GCAGTAAACG TTCGAGATTA TATTGCAGAG 6960 

AATTATGGTT TATTTATCAA TGGGGAATTT GTTAAAGGTA GCAGTGACGA AACAATCGAA 7020 

GTGACTAATC CAGCAACTGG AGAAACACTA TCACATATTA CAAGAGCAAA AGATAAAGAT 7080 
TCAGAACGTG CACAAATGTT GCGTGATATT GGTGATAAAT TAATGGCACA AAAAGATAAA 7200 

ATTGCAATGA TTGAAACATT AAATAATGGT AAACCGATTC GTGAGACAAC AGCAATTGAT 7260 

ATTCCATTTG CTGCAAGACA TTTCCATTAT TTCGCAAGTG TTATTGAAAC AGAAGAAGGT 7320 

ACAGTGAATG ATATCGATAA AGACACAATG AGTATCGTAC GACATGAGCC GATTGGCGTC 7380 

GTAGGTGCTG TTGTTGCTTG GAACTTCCCA ATGCTATTAG CTGCATGGAA GATTGCGCCA 7440 

gCCATTGCTG CAGGTAATAC AATTGTGATT CAACCTTCX5T CTTCAACACC ATTAAGTTTA 7500 

TTGGAAGTTG CTAAAATTTT CCAAGAGGTA TTACCTAAAG GTGTTGTCAA TATACTAACG 7560 

GGTAAAGGTT CAGAATCAGG TAATGCAATT TTCAATCATG ATGGT6TAGA TAAATTATCA 7620 

TTTACX3GGCT CAACTGATGT AGGTTATCAA GTTGCCGAAG CTGCAGCAAA ACATCTAGTA 7680 

CCCGCTACAT TAGAGCTTGG TGGTAAAAGC GCCAATATCA TATTAGATGA TGCTAATTTA 7740 

GACCTTGCAG TTGAAGGTAT TCAGTTAGGT ATTTTATTCA ACCAAGGTGA AGTATGTAGT 7800 

GCAGGTTCTC GATTATTAGT TCATGAAAAA ATTTATGATC AATTGGTGCC ACGTTTACAA 7860 

GAGGCATTTT CAAATATTAA AGTTGGAAAT CCACAAGATG AAGCTACACA AATGGGTAGT 7920 

CAAACTGGTA AGGATCAATT AGATAAAATT CAATCATATA TTGATGCAGC AAAAGAATCA 7980 

GATGCACAAA TTTTAGCAGG CGGTCATCGC TTAACTGAAA ATGGATTAGA TAAAGGGTTC 8040 

TTCTTTGAGC CGACATTAAT TGctGTGCCA GACAATCATC ACAAATTAGC ACAAGAAGAA 8100 

ATATTTGGAC CAGTGTTAAC AGTGATTAAA GTGAAGGACG ATCAAGAAGC AATTGATATA 8160 

GCTAATGATT CTGAGTATGG TTTAGCAGGC GGTGTATTTT CTCAAAATAT CACACGTGCA 8220 

TTAAATATTG CTAAAGCTGT ACGTACAGGA CGTATTTGGA TTAACACTTA CAACCAAGTA 8280 

CCAGAAGGCG CACCATTTGG TGGTTATAAA AAATCAGGTA TCGGTCGAGA AACTTATAAA 8340 

GGTGCGTTAA GTAACTATCA ACAAGTTAAA AATATTTATA TTGATACAAG CAATGCTTTA 8400 

AAAGGTTTGT ACTAGAATAA ATATCGTTTC TGAAGCGTGT TTGTAGGTCA GTCTAGCGGT 8460 

AAGTCTTAAC ATTTAACGGC GTTGTTTAGA TTTTAAGCAA AACAAAATAT ATAGGAACAC 8520 

GTATCATGAT ATTAGGATAT AATGACTAAA ATAATAGCAG TAGGATGGTT TTTAATTGCA 8580 

AATCATCTTA CTGCTGTTTT TAATTATGCT AATTTGOGAT GCGGCTATTA TAAGGACAGA 8640 

GTTGTTTATT AATTATGGT6 ATTTAGAAAT ATGAAGTTCA ATATGCAAAG TCATCGTTTG 8700 

TTTTAATATG CGGAACAATC ATTAAAGTTA TTGCGATTTT TTGAACTTAA TGAAACTAAA 8760 

CAATAAATTT GAGATACTTT TTTGTCATTT TTATGTAACT AACACAATAA TCTCGTACAT 8820 

TATTAAAATT TTCTATATGA TAGGAATAAA GCAAAGCGCG AGTGTGCTGT AAAAGTTTTC 8880 

GATGATGTAT AAATCATGGT TAATTACGGA AGCATTAATA TTAACCTGAG AAGCTATAAA 9000 

GAATTATTTT TAAAAGCGAC AATATTAAAT ACGACGCATT TATTTAGGAG TGGCAAACGT 9060 

ATGAATGGGA AAAAGGCGAA TACGATAAAC AGATACAAAT ATTTTCATCA TGTCAATCAT 9120 

CAAAAAATTC AACAAAGTTC TAAAAAGACG CTGTGGGCAT CACTAATCAT CACATTGTTA 9180 

TTTACAGTGA TTGAATTTGT CGGAGGTTTA GTATCTAATt CATTGGCATT ACTGTCAGAT 9240 

TCATTTCATA TGCTTAGTGA TGTATTAGCA CTTGGTTTAT CTATGTTGGC CATTTATTTT 9300 

GCAAGTAAAA AGCCGACTGC ACGATACACA TTTGGATATT TAAGATTTGA GATATTAGCT 9360 

GCATTTTTAA ATGGTTTAGC ATTAATTGTA AITTCAATCT GGATTTTATA TGAAGCTATT 9420 

GTACGTATTA TTTATCCGCA ACCAATTGAA AGTGGCATTA TGTTTATGAT TGCTAGTATT 9480 

GGTTTACTCG TCAATATTAT TTTGACTGTT ATCCTTGTAA GGTCTTTAAA ACAAGAAGAC 9540 

AATATCAATA TTCAAAGTGC ATTATGGCAT TTCATGGGAO ACTTATTGAA CTCTATTGGT 9600 

GTCATCGTTG CAGTTGTATT GATTTACTTT ACAGGATGGC GCATCATC6A CCCAATCATT 9660 

AGTATTGTAA TTTCACTCAT CATTTTACGT GGTGGTTATA AAATTACGCG TAATGCgTGG 9720 

tTAATTTTAA TGGAAAGTGT GCCTCAACAT TTGGATACTG ATCAAATTAT GGCAGATATT 9780 

AAAAACATAG ATGGCATATT AGATGTACAT QAATTTCATT TGTGGAGTAT TACAACAGAG 9840 

CATTATTCAT TAAGTGCCCA TGTTGTGTTA GATAAAAAAT ATGAGGGTGA T6ATTATCAA 9900 

GCGATTGATC AAGTATCATC ATTGTTGAAA GAAAAATATG GCATTGCACA TTCAACGTTG 9960 

CAAATTGAAA ACTTGCAATT GAATCCATTA GATGAGCCAT ACTTCGACAA ATTAACATAA 10020 

ATAAAACATT GTAGCGCCTA AAACATTAAT CTATGTCATA GGCGCACGTT TCGTTTTATA 10080 

CTTATGTTGC ATCATTTAAA TGATTTTCGT CAATTTCTTT GATGCTATCT ACATCTAACA 10140 

CGACiATCTTT AGGTTTCAAA ATATGAATAT GTTTTTCATC ATTTGTATGT A?lAATGCGTT 10200 

CTATGATGTA CCTTTGACCG GCCATTGTTT CTACAGCAAT CTTTTTGTTT CTAGCTAAAC 10260 

TTGCTACGAC AGATTCTTTA TCCATAATGA TAGCCCCCTA TATATATGTT TATTTACTTA 10320 

TACCCTAACA TGATTTTTAT ACTCTTTGAA AATATATTTT ACAGAATTTT ATCTAAATAT 10380 

TTAAAAAAAT ATCTTAATAT CCTTGTAATC CGATAAGAAT TATAGTAATA TTTTTTCAAC 10440 

CATtGTTATA GGAGGTCTTA TTAATGACAT TATTTTTATT AGAAGCTAAC AATCTTGATT 10500 

TTGCATCAAC GAAAGAAGAA CTAGAAGCAA AGGCAGCATC ACTATCTACG AAGACAATTC 10560 

CAACATTAAT TGAAGTACAA GCTACTGAAA ATTTAACTCA TGGTTATTTT ATTGTGGAAG 10620 

CAAATGACGA aGCAGAAGCT AAACAATTTT TAACAGAAGC AGATATTAGT ATTCAATTAG 10680 

TTGATTACCT TGTAACTTGG AACATTCCGG AAGGCATTAC GATGGATCAA TATTTAGCAC 10800 

GTAAAAAGAA AAATTCTGTT CATTATGAAG AAGTGCCAGA AGTTGAATTT AAACGCACAT 10860 


ATGTATGTGA AGATATGTCT AAATGTATTT GTTTATACAA CGCACCTGAT GAAGAAGCGG 10920 

TACGTCGCGC GCGCAAAGCA GTTGATACAC CGATTGATGG CATCGAAAAA CTTTAATAAG 10980 

ACAACAAGTT GATGAGATAT ATGTATATAG GTTTGGCATG GATTTCGATT GCAGTTAATT 11040 


AGAATAGCTC AATGCTATAA ATGTAAGTAG TTGATATGAA GAAACTAATG AACTAAATGC 11100 

AAGTATTGTC TAAAACAATC ATTTTATTGA AATTTAGTAG AGCTGAAATT AATATAACGT 11160 

CGTTAATTGA ATAAOGCTTA TGTTATAAGA GCACTCATAC CAAACCA7AA TCATCTATAG 11220 

ATATAACAAT TCACGATATA AGGGCTGTGT TTGGCATAGC CCTTTAGATA TACACTTAAT 11280 

TCCTATTAAA ATA6TA6GGA TTAAAAGGGG 6CTTGTCATG ATTAAAATTC AACAATTACA 11340 

ACATCACTTT GGATCACATA AAGTAATTCA TAACTTTAAT TTGGACATTA GCAAGGGAGA 11400 

AATAGTCACT TTCATAGGGA AAAGTGGTTG CGGAAAGTCT ACTTTACTCA ATATTATCGG 11460 

TGGATTTATT CATCCATCGT CTGGTCGTGT CATTATTGAT AACGAAATTA AACAACAGCC 11520 

ATCTCCAGAT TGTTTAATGC TATTTCAACA TCATAATTTG CTGCCATGGA AAACGATTAA 11580 

TGACAACATT AGGATTGGAT TACAACAGAA AATTAGTGAT GAAGAGATTA ACGCACAGCT 11640 

TAAATTAGTT GATTTAGAAG ACAGGGGAAA GCATTTTCCC GAGCAACTGT CCGGGGGTAT 11700 

GAAACAACGT GTGGCACTAT GTCGAGCGCA TGTGCATAAG CCTAACGTTA TATTGATGGA 11760 

TGAGCCATTA GGTGCATTAG ATGCATTTAC ACGTTATAAA CTTCAGGATC AACTAGTGCA 11820 

aCTAAAACAT AAAACGCAAT CAACTATTAT TTTAGTGACG CATGACATTG ATGAAGCTAT 11880 

TTATCTTTCC GACCGCATTG TTCTGTTAGG TGAAGGGTGC AATATTATTT CTCAATATGA 11940 

AATTACAGCA TCACATCCAC GCAGTCGTAA TGATAGCCAC CTACTTAAGA TTCGTAATGA 12000 

AATTATGGAA ACATTTGCAT TGAATCATCA TCAAGTTGAA CCTGAATATT ATTTATAAGG 12060 


AGTGAGTGAC GATGAAAAGG TTAAGCATAA TCGTCATCAT TGGAATCTTT ATAATTACAG 12120 

GATGTGATTG GCAAAGGACG TCTAAAGAAC GGTCTAAAAA TGCCCAAAAT CAGCAAGTGA 12180 

TTAAAATTGG ATATTTGCCG ATTACACATT CAGCTAATTT GATGATGACT AAAAAATTAT 12240 


TATCACAATA CAATCATCCG AAATATAAAC TAGAATTAGT TAAATTCAAT AATTGGCCAG 12300 

ATTTAATGGA CGCATTAAAC AGTGGTCGTA TTGATGGTGC ATCAACTTTA ATAGAGCTAG 12360 

CGATGAAATC AAAACAGAAG GGCTCAAATA TAAAGGCTGT GGCATTGGGC CATCAT6AAG 12420 

GCAATGTCAT TATGQGACAA AAAGGTATGC ACTTAAATGA ATTTAATAAT AATGGOGATG 12480 

GTAAACAATT AAAGATTAAA CCGGGGCATT TTAGCTATCA TGAAATGTCX3 CCAGCAGAAA 12600 

TGCCAGCCGC attgagtga;i CACAGAATTA CAGGGTATTC TGTAGCCGAA CCATTCGGTG 12660 

CACTGGGTGA AAAGTTAGGC AAAGGTAAGA CTTTGAAACA TGGTGATGAC GTTATACCTG 12720 

ATGCGTATTG CTGTGTGCTA GTACTGAGAG GGGAATTGCT TGATCAACAC AAGGATGTAG 12780 

CGCAAgCATT TGTACAAGAT TATAAAAAGT CTGGCTTTAA AATGAATGAT CGCAAGCAAA 12840 

GTGTAGACAT TATGACGCAT CATTTTAAAC AAAGTCGTGA CGTTTTAACA CAGTCAGCGG 12900 

CATGGACATC CTATGGTGAT TTAACAATTA AGCCATCCGG CTATCAAGAA ATTACGACAT 12960 

TGGTAAAACA ACATCATTTG TTTAATCCAC CTGCATATGA TGACTTTGTT GAACCGTCAT 13020 

TGTATAAGGA GGCATCGCGT TCATGACACG TCCCACAAAT AACAAATTTA TATTACCTAT 13080 

TATCACATTT ATTATTTTCT TAGGCATTTG GGAAATGGTC ATTATTATTG GGCATTACCA 13140 

ACCTGTATTG TTACCGGGTC CTGCTCTTGT AGGAAAAAGT ATATGQTCTT TCATTGTTAC 13200 

T6QAGAAATT TTCCAACATT TA6CAATTAG TTTATGGAGA TTTGTAGrCGG GCTTTGTTGT 13260 

CGCATTGTTG GTTGCTATTC CATTQGGCTT CTTGCTTGGA AGGAATC6TT GGCTATACAA 13320 

CGCTATCGAA CCGCTATTTC AATTGATTAG GCCGATATCT CCGATAGCAT GGGCACCATT 13380 

TGTTGTTCTA TGGTTTGGTA TTGGTAGTTT 6CCAGCGATT GCGATTATTT TTATCGCTGC 13440 

TTTrrrCCCA ATTCTGrrCA ATACTATTAA AGGCGTTAGA GACATTGAAC CTCAATATTT 13500 

AAAAATAGCA GCAAATTTAA ATTTAACTGG GTOGTCATTG TATCGCAATA TATTATTTCC 13560 

CGGGGCATTT AAACAAATCA TGGCTGGGAT ACATATGGCG GTAGGAACAA GTTGGATATT 13620 

TTTAGTTTCT GGTGAAATGA TTGGTGCACA ATCGGGATTA GGTTTTTTAA TCGTTGATGC 13680 

ACGAAATATG TTGAACTTAG AAGATGTTTT AGCAGCAATA TTCTTTATCG GATTATTTGG 13740 

TTTXATTATT GATCGATTCA TTAGTTATAT TGAGCAGTTT ATACTTAGAA GATTTGGTGA 13800 

ATAAGGAGAG ATGATGATGA CTTTAGAAAC GCTTATCAAA GAACAATTAG ATCCTCATTT 13860 

AGTAGAAGTT GATGAAGGGA CGTATTATCC GAGAACATTT ATTCAGCAAT TATTTGTAGA 13920 

TGGTTATTTC GGTGAGGCGG CATTGAGAAA AAATGCTGAA GTAATCGAAG CTGTATCGCA 13980 

GTCTTGTTTG ACAACAGGAT TTTGTTTATG GTGCCAATTA GCTTTTTCAA CGTATTTAGA 14040 

AAATGCCACG CAGCCACATT TAAATAATGA CTTACAACAG CAATTGTTAT CTGGAGAAAT 14100 

ATTAGGTGCT ACCGGATTGT CTAATCCGAT GAAGTCATTT AATGATTTAG AAAAGTTGAA 14160 

CCTTGAACAC ACTTATGTTG ATGGACAATT GGTTGTCAGT GGACGTATGC CAGCTGTAAG 14220 

TAATATTCAA GAAGACCATT ATTTTGGTGC GATTTGGAAA CATGAATCAT CAGATGAATT 14280 

TTTAGGAGTC AACGGGTCAG CAACGTATCA AATCACATTG AATCAAGTCG TAGTGCCACA 14400 

ATCACAAATT ATCACGCATG ATGCGAAGCA GTTTGCGGCA ACTATTCGCC CGCAATTTAT 14460 


TGCTTACCAA ATTCCAATAG GATTAGGCTC AATTAAAAGT TCTTTAGAGT TAATTGATGC 14520 

ATTTTCAAAT GTGCAAAACG GAATAAATCA ATATTTAGAG TATGATGTTG AAGCTTTTAA 14580 

AAAACGTTAT CGTCAACTTA GAGAGGAATA TTATGCAATA TTAGATGACG GTAACTTAAC 14640 

TTCACATTTA AATGAATTAA TATCATTGAA GAAGGACATC GGCTATTTAT TGTTAGATGT 14700 

AAATCAAGCT TCTGTTGTCA ATGGTGGTTC TAGAGCGTAC ACACCATATT CGCCACAAGT 14760 

TCGCAAGTTA AAAGAAGGAT TCTTCTTCGC AGCATTGACA CCX3ACATTAA GACATTTAGG 14820 


TAAACTTGAA GCAGA6TTGA AGGGGTAAGT 6TGATAAGCT 6ATTTTTTGT TTAGATGCGT 14880 

TTGTTGAAAC ATTTTTTAAA ATAATATAAA TCTTAGTTTA TAAACATTTT CTGTTAATTT 14940 

GTTATATCCT TTTAACTAGG AAAATATACA TTTCGTAATA ATAATAATCG TTATCATTGA 15000 

AAAAGTGTTA ATAAGGTGTA TAATGAAAAT QTGAACAATT AATGAACTTC TTATTTTAAA 15060 

GAAGGTGAAT ACTATAGATA CGCATACTAA AGAACAACAA TTCTCGAATC TAGTAAGATC 15120 

TTATCGTAAA GAATACGTGG GTAAAGGACC CAATAGTATT CGAGTGTCGT TTAAAGATAA 15180 

TTGGGCGATT GCACATATGA CAGGTGTTTT GAGTAAAGTT GAGAGTTTTT ACCTAAACGA 15240 

CAAACXSCAAT GAATCGATGC TCCATTATAC ACGCAGAGAG AAGATTAAAC AGATGTATAA 15300 

AGAAATAGAT GTAAATGAGA TGGAAAGTCT TGTAGGCGCT AAGTTTGTAA AATTATTTAC 15360 

AGATATTGAT TTGAATGATG ATGAAGTCAT TTCAATATTT GTTTTCGATA AGTCAATAGA 15420 

ATAAGTGTTG CTGGTGTAAG GTACACGGTG CTGTTTGCTA ACTTCGCTTT GAATTTAACA 15480 

ATAATTCAAG GGGGTGGTAT GTCAAACGGT GCCGTTTTTT TGTCATATTT TTAAAACAAG 15540 

CAAdATGCAA CACGTACTTT AAGGAAGTCA AAATTTATCA TTTAGGAGAG ATGGATATGA 15600 

AAATCGTAGC ATTATTTCCA GAAGCAGTAG AAGGTCAAGA AAATCAATTA CTTAATACTA 15660 


AAAAAGCATT AGGATTAAAA ACATTTTTAG AGGAAAGAGG ACATGAGTTC ATTATATTAG 15720 

CAGATAATGG TGAAGACTTA GATAAACATT TACCAGATAT GGATGTGATT ATTAGTGCGC 15780 

CATTTTATCC TGCATATATG ACTCGTGAAC GTATTGAAAA AGCACCGAAC TTGAAATTAG 15840 


CAATTACAGC AGGTGTAGGA TCTGACCATG TAGATTTAGC GGCAGCAAGT GAACACAATA 15900 

TTGGTGTCGT TGAAGTTACA GGAAGTAATA CAGTTAGTGT GGCAGAACAT GCX3GTTATGG 15960 

ATTTATTAAT ACTTCTTAGA AACTATGAAG AAGGTCATCG TCAATCAGTA GAAGGTGAAT 16020 


GGAACTTGTC TCAAGTAGGT AATCATGCGC ATGAATTACA ACACAAAACA ATTGGTATTT 16080 
TACAACACTA TGATCCAATC AATCAACAAG ACCATAAATT GTCTAAATTT GTAAGCTTTG 16200 

ATGAACTTGT TTCAACAAGT GATGCGATTA CAATTCATGC ACCATTAACA CCAGAAACTG 16260 

ATAACTTATT TGATAAAGAT GTTTTAAGTC GTATGAAAAA ACACAGTTAT TTAGTGAATA 16320 

CTGCACGTGG TAAAATTGTA AATCGCGATG CGTTAGTTGA AGCGTTAgCA TCCGAGCATT 16380 

TACAAGGATA TGCTGGTGAT GTTTGGTATC CaCAACCtGC ACCTGCTGAT CATCCATGGA 16440 

GAACAATGCC TAGAAATGCT ATGACGGTTC ACTATTCAGG TATGACTTTA GAAGCACAAA 16500 

AACGTATTGA AGATGGAGTT AAAGATATTT TAGAGCGTTT CTTCAATCAT GAACCTTTCC 16560 

AAGATAAAGA TATTATTGTT GCAAGTGGTC GTATTGCTAG TAAAAGTTAT ACAGCTAAAT 16620 

AGAATAAGGA TGCTGGGCTA GCGATTAACG CTTTCAATTT TATATAAATG AATCATATAA 16680 

GCACTACTGC TGTTGTAAAG ATGGCAGTAG TTTTTTTATG ATTACATCTA AGTATAGTCA 16740 

CGGCTATGTT AGGACAATGA TTTAACATTT ACGCACATAT GTGTTCACTT ACGCAATTAT 16800 


TGAnAAATnT CATTCATGTG GnAATC 16826 
(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4012 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

TTCAATGAGA GTAGTGGGCT GATGTTTAGC GATATCGCGT AAGATTAACC ATTGGCCATA 60 

ATATATATTG TGTTTTTCTA AAATCGGCTC GGCTAATTTT AAATAGGGGC GATATATTGT 120 

TATJg^CTA TTGAAAAATT CTTGT6ATAG CATAGTGACA TCTCCTAAGA CAAAATAGTT 180 

AGCTTAGCTA mCCTTTTTAC AACAATAGTA ATTATAAAAC GGGAGCAATT AGAAATCAAT 240 

ATATAATTAT TAAGAGCAAA AATAATTATA CTTTGTTAAA ATAAGCGTAA TTACATGTAA 300 

ATAGGGGGAT ACTAATGATA TTGAAATTTG aTCACATCAT TCATTATATA GATCAGTTAG 360 

ATCGGTTTAG TTTTCCAGGA GATGTTATAA AATTACATTC AGGTGGGTAT CATCATAAAT 420 


ATGGAACATT CAATAAATTA GGTTATATCA ATGAAAATTA TATTGAGCTA CTAGATGTAG 480 

AAAATAATGA AAAGTTGAAA AAGATGGCAA AAACGATAGA mGGCGGAGTC GCTTTTGCTA 540 

CTCAAATTGT TCAAGAGAAG TATGAGCAAG GCTTTAAAAA TATTTGTTTG CGTACAAATG 600 


ATATAGAGGC AGTTAAAAAT AAACTACAAA GTGAGCAGGT TGAAGTAGTA GGGCG6ATTC 660 
ATCAGGATGA TGATGAAATT AAGCCACCAT TTTTTATTCA ATGGGAAGAA AGTGATTCCA 780 

TGCGTACTAA AAAATTGCAA AAATATTTTC AAAAACAATT TTCAATTGAA ACTGTTATTG 840 

TGAAAAGTAA AAACCGATCA CAAACAGTAT CGAATTGGTT GAAATGGTTT GATATGGACA 900 

TTGTAGAAGA GAATGACCAT TACACAGATT TGATTTTAAA AAATGATGAT ATTTATTTTA 960 

GAATTGAAGA TGGTAAAGTT TCAAAATATC ATTCGGTTAT CATAAAAGAC GCACAAGCAA 1020 


CTTCACCATA TTCAATTTTT ATCAGAGGTG CTATTTATCG CTTTGAACCA TTAGTATAAA 1080 

TATACGTAAG TGCTATGAGC GAGAATGCCC ATATGAATAA TGACAAGCAC AATGGAAAGA 1140 

ATCGTTAATA TATTATTTAA TCGTGATGAC TTAATTAAAA TGAAAAAGAT TGATAATATA 1200 

AATGTGAAAA AGATAAGTAT AACCCGTAAA CTAAAGTAAT TCACGGT6AG AGGTTGACTC 1260 

AATGTCATAA TGATTGCAAC GATGTTCATA ATTATAAATA GACTTAAAAT AATTGTTCTC 1320 

ATATCAAACA CCTCATTGTT AGATTATTGA CATTATAACA GGGGTAATTG TATATGAACA 1380 


TTAATGTGGT TGCTTGAGGA AAAATTTATT CATTGAAGTC AA6TTGGTTC ATTTTAGAAA 1440 

TGAATATCGT GTTAGATGAT GAAAGTATAT TGAAGTATAG GTAACTAGTT GAAAAGTATT 1500 

AATTGTACGA TAACATTAAA TTTAACACGA AACATAGATA TAAAATGATT CACAATTAAA 1560 

ATGGGTAAAT TTGAACTTGC TAAACTATTA ATTGGAGCAT GGACATTTCA AAAATAAGAG 1620 

TTCAAATCTT ACACAAGCTC TGAATCGACA CTATAAGATA CAAACTGTAT AATTAAAGGT 1680 

ATTGTTAAAT AGAAGGAGAT ATCATAAATC ATGGAAAAGA TGCATATCAC TAATCAGGAA 1740 

CATGACGCAT TTGTTAAATC CCACCCAAAT GGAGATTTAT TACAATTAAC GAAATGGGCA 1800 

GAAACAAAGA AATTAACTGG ATGGTACGCG CGAAGAATCG CTGTAGGTCG TGACGGTGAA 1860 

GTTCAGGGTG TTGCGCAGTT ACTTTTTAAA AAAGTACCTA AATTACCTTA TACGCTATGT 1920

TATMTTCGC GTGGTTTTGT TGTTGATTAT AGTAATAAAG AAGCGTTAAA TGCATTGTTA 1980 

GACAGTGCAA AAGAAATTGC TAAAGCTGAG AAAGCGTATG CAATTAAAAT CGATCCTGAT 2040 

GTTGAAGTTG ATAAAGGTAC AGATGCTTTG CAAAATTTGA AAGCGCTTGG TTTTAAACAT 2100 

AAAGGATTTA AAGAAGGTTT ATCAAAAGAC TACATCCAAC CACGTATGAC TATGATTACA 2160 

CCAATTGATA AAAATGATGA TGAGTTATTA AATAGTTTTG AACGCCGAAA TCGTTCAAAA 2220 

GTGCGCTTGG CTTTAAAGCG AGGTACGACA GTAGAACGAT CTGATAGAGA AGGTTTAAAA 2280 

ACATTTGCTG AGTTAATGAA AATCACTGGG GAACXSCGATG GCTTCTTAAC GCGTGATATT 2340 

AGTTACTTTG AAAATATTTA TGATGCGTTG CATGAAGATG GAGATGCTGA ACTATTTTTA 2400 

GTAAA6TTGG ATCCAAAAGA AAATATAGCG AAAGTAAATC AAGAATTGAA TGAACTTCAT 2460 

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CAAAATATGA TTAATGATGC GCAAAATAAA ATTGCTAAAA ATGAAGATTT AAAACGAGAC 25 BO 

CTAGAAGCTT TAGAAAAGGA ACATCCTGAA GGTATTTATC TTTCTGGTGC ACTATTAATG 2640

TTTGCTGGCT CAAAATCATA TTACTTATAT GGTGCGTCTT CTAATGAATT TAGAGATTTT 2700 

TTACCAAATC ATCATATGCA GTATACGATG ATGAAGTATG CACGTGAACA TGGTGCAACA 2760 

ACTTACGATT TCGGTGGTAC AGATAATGAT CCAGATAAAG ACTCAGAACA TTATGGATTA 2820

TGGGCATTTA AAAAAGTGTG GGGAACATAC TTAAGTGAAA AGATTGGTGA ATTTGATTAT 2880

GTATTGAATC AGCCATTGTA CCAATTAATT GAGCAAGTTA AACCGCGTTT AACAAAAGCT 2940 

AAAATTAAAA TATCTCGTAA ATTAAAACGA AAATAGATTA ACGACTGAAA TCTGAACGCT 3000 

CATAAGACTG TCATTTGCGT TCAGATTTTT TTACACAATA TAGAATGGTT GAGTAAAATA 3060 

TTTTTGAATA TAGTGAAAOA GGGGGAAGTA CTGTGATAAA AAAGCTATTA CAATTTTCTT 3120 

TAGGGAATAA GTTTGCTATC TTTTTAATGG TTGTTTTAGT TGTCTTGGGC GGTGTATATG 3180

CGAGTGCTAA ATTGAAATTA GAATTACTAC CAAATGTACA AAATCCAGTT ATTTCAGTTA 3240 

CAACAACAAT GCOGGGTGCA ACGCCACAAA GTACCCAAGA TGAAATAAGT AGTAAAATTO 3300 

ACAATCAAGT AAGATCATTG GCATATGT6A AAAATGTTAA AACGCAATCC ATACAAAATG 3360

CTTCAATTGT AACAGTTGAA TATGAAAATA ATACAGATAT GGATAAAGCA GAAGAACAGC 3420 

TTAAAAAAGA AATCGATAAA ATTAAATTTA AAGATGAAGT TGGTCAACCA GAATTAAGAC 3480 

GTAATTCXSAT QGATGCTTTT CCGGTTTTAG CATATTCATT TTCAAATAAA GAGAATGACT 3540

TGAAAAAAGT AACGAAAGTA CTGAATGAAC AATTAATACC AAAATTGCAA ACGGTAGATG 3600 

GTGTGCAAAA TGCGCAATTA AATGGGCAGA CGAACCGTGA AATCACCCTT AAATTTAAGC 3660 

AAAATGAACT TGAAAAATAT GGGTTGACTG CTGATGATGT AGAAAACTAT CTAAAAACGG 3720 

caacSagaac aacgccactt ggattgttcc tuvtttggtga taaagataat caattgttgt 3780

tgatggtcaa tatcaatctg ttgatgcttt taaaaacata aatattccat taacgtggca 3840 

GGAGGACCAA GGGCATCTCA TCCCAAAGTG ACCATAAACC AAATTCAGCC ATGTCAGACG 3900 

TTATCAGGCA TCACCACAGC AAATTCAAAG CXSTCAGCnCC AATATATAGT GGATGCCGCA 3960 

nGAACTAGGG GTTTAGCGnT ATCAGTGGTG TGGCGACTCT ATTCTAAACG AT 4012 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7778 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

CAATATAGGT CGCCGAGTTT CAACTaCATC AACTGGTTCA GTTACATTAG ATAATGCGCT 60 

AGGTGTAGGT GGCTATCCTA AAGGACGAAT TATTGAAATT TATGGTCCTG AAAGTTCTGG 120 

TAAGACAACA GTAGCGCTTC ACGCTATTGC TGAAGTACAA AGTAATGGCG GGGTGGCAGC 180 

ATTTATCGAT GCTGAACATG CTTTAGATCC AGAATATGCT CAAGCATTAG GCGTAGATAT 240 

CGATAATTTA TATTTATCGC AACCGGATCA TGGTGAACAA GGTCTTGAAA TCGCCGAAGC 300 

ATTTGTTAGA AGTGGTGCAG TTGATATTGT AGTTGTAGAC TCAGTTGCTG CTTTAACACC 360 

TAAAGCTGAA ATTGAAGGAG AAATGGGAGA CACTCACGTT GGTTTACAAG CTCGTTTAAT 420 

GTCACAAGCG TTAOGTAAAC TTTCAGGTGC TATTTCTAAA TCAAATACAA CTGCTATTTT 480 

CATCAACCAA ATTCGTGAAA AAGTTGGTGT TATGTTCGGT AATCCAGAGA CTACACCAGG 540 

TGGACGTGCA TTAAAATTCT ATAOTTCAGT AAGACTAGAA GTACGTCGT6 CAGAACAGCT 600

TAAACAAGGA CAAGAAATTG TAGGTAATAG AACTAAAATT AAAGTCGTTA AAAATAAAGT 660 

GGCACCACCA TTTAGAGTAG CTGAAGTTGA TATTATGTAT GGACAAGGTA TTTCTAAAGA 720 

GGGTGAACTT ATTGATTTAG GTGTTGAAAA CGACATCGTT GaTAAATCAG GAGCATGGTA 780

TTCTTACAAT GGOGAACGAA TGGGTCAAGG TAAGGAAAAT GTTAAAATGT ACTTGAAAGA 840 

AAATCCACAA ATTAAAGAAG AAATTGATCG TAAATTGAGA GAAAAATTAG GTATATCTGA 900 

TGGTGATGTT GAAGAAACAG AAGATGCACC AAAGTCATTA TTTGACX3AAG AATAGTACAC 960

AAATTTATAT CTATAGTTAA ACTTAGCAAA TATCCTTATA GGATTGATTG AAAGTGATAT 1020 

TCATCTCATA AAGCTAGAAT AATATCTAAC TTTATGGGAT ACACTACAAA TCGAGACTAT 1080 

AAGGTTTTTT ATTTTATTTA TTATTACATT ATCAATAGTT TTATAATCGA GCTTCAAAAC 1140

TTTASAAAAT AGTAGAAATA GCATTCAATA TAGTGCAAAA GTGCAAATTG ATAACTTGAC 1200 

ACTTATCTCC TATAAACCGT ACAATTAATT TGTATGATTT ATATATAATT TCATAAAGTC 1260 

ATATTGAATT TCATATAAAG AGCAAACCCT AGAAAAGGAG GTGTTTGTGT GAATTTATTA 1320 

AGCCTCCTAC TCATTTTGCT GGGGATCATT CTAGGAGTTG TTGGAGGGTA TGTTGTTGCC 1380 

CGAAATTTGT TGCTTCAAAA GCAATCACAA GCTAGACAAA CTGCCGAAGA TATTGTAAAT 1440 

CAAGCACATA AAGAAGCTGA CAATATCAAA AAAGAGAAAT TACTTGAGGC AAAAGAAGAA 1500 

AACCAAATCC TAAGAGAACA AACTGAAGCA GAACTACGAG AAAGACGTAG CGAACTTCAA 1560 

ASACAAGAAA CCCGACTTCT TCAAAAAGAA GAAAACTTAG AGCGCAAATC TGATCTATTA 1620 

GATAAAAAAG ATGAGATTTT AGAGCAAAAA GAATCAAAAA TTGAAGAAAA ACAACAACAA 1680 
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10 



CGCATCTCCG GTCTCACTCA AGAAGAAGCT ATTAATGAGC AACTTCAAAG AGTAGAGGAA 1800 

GAACTGTCAC AAGATATTGC AGTACTTGTT AAAGAAAAAG AAAAAGAAGC TAAAGAAAAA 1860 

GTTGATAAAA CAGCAAAAGA ATTATTAGCT ACAGCAGTAC AAAGATTAGC AGCAGATCAC 1920 

ACAAGTGAAT CAACGGTATC AGTAGTTAAC TTACCTAATG ATGAGATGAA AGGTCGAATC 1980 

ATTGGACGAG AAGGACGAAA CATCCGCACA CTTGAAACTT TAACTGGCAT TGATTTAATT 2040 

ATTGATGACA CACCAGAAGC GGTTATATTA TCTGGTTTTG ATCCAATAAG AAGAGAAATT 2100 

GCTAGAACAG CACTTGTTAA CTTAGTATCT GATGGACGTA TTCATCCAGG TAGAATTGAA 2160 

GATATGGTCG AAAAAGCTAG AAAAGAAGTA GACX3ATATTA TTAGAGAAGC AGGTGAACAA 2220 

GCTACATTTO AAGTGAACGC ACATAATATG CATCCTGACT TAGTAAAAAT TGTAGGGCGT 2280 

TTAAACTATC GTACGAGTTA CGGTCAAAAT GTACTTAAAC ATTCAATTQA AGTTGCGCAT 2340 

CTTGCTAGTA TGTTA6CTGC TGAGCTAGGC GAAGATGAGA CATTAGCGAA ACX3AGCTGGA 2400 

CTTTTACATG ATGTTGGTAA AGCAATTGAT CATGAAGTAG AAGGTAGTCA TGTTGAAATC 2460 

GGTGTAGAAT TAGCGAAAAA ATATGGTGAA AATGAAACAG TTATTAATGC AATCCATTCT 2520 

CATCATGGTG ATGTTGAACC TACATCTATT ATATCTATCC TTGTTGCTGC TGCAGATGCA 2580

TTGTCTGCGG CTCGTCCAGG TGCAAGAAAA GAAACATTAG AGAATTATAT TCGTCGATTA 2640 

GAACGTTTAG AAACGTTATC AGAAAGTTAT GATGGTGTAG AAAAAGCATT TGCGATTCAG 2700 

GCAGGTAGAG AAATCCGAGT GATTGTATCT CCTGAAGAAA TTGATGATTT AAAATCTTAT 2760

CGATTGGCTA GAGATATTAA AAATCAGATT GAAGATGAAT TACAATATCC TGGTCATATC 2820

AAGGTGACAG TTGTTCGAGA GACTAGAGCA GTAGAATATG CGAAATAATT TTTGTCTCCC 2880 

TCACAAATTA GTGAGGGAGC TTTTTrAAGT TGTAGTCTTA AtCTAGTTAG ACAGCACTTT 2940 

ATCGGTAATA ACTATATTAA ACAGTAGTTA TTTGAAAGTA AGACGGACCT TATATTAAAT 3000 

AAGAAGTTAT TGCTTTTAAT AAAAATGTTT TAGGCTTCGT AATTACTATA TTTATATTAT 3060 

GTAAACCTAT AAAGATGATT GGTTTTCTAT CCAATAAAAA AGAAGAGAAG ATGTAACACA 3120 

TCTTCTCTTC yGCAATATTA ATTAGGATTT ATTTCTAAGT TGAGTTATTT TAATTGTAAA 3180 

TCTGTTTTCT TTAATTCTTT TATAACTTCT GCAGTATCAT AACAATTTCT TGCAATTGTT 3240 

GAATATCTCT CTGCTAAACG ATATGCATTA ATGTAAAGCT TTAAACTTTC TTTAGCTATA 3300 

TCCTCTGCAT CTTCGAATTT TGATGGGTTA GACATAACCA CTAATTCTGC AAATTTTTCT 3360 

GGATCAATAT TAATAGACAT GTATTTATTT ACAACTCCTA TTTATTTTGA TGTCTTAATA 3420

CTAACATATT GAAGTTTTCA GACAAAGTAA TGTCTCTCTA TAATTGAAGA AAAATAATTC 3480 
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GGATGAACAA AACATGAGAA TAATGTTTAT AGGGGATATC GTAGGTAAAA TTGGACGAGA 3600 

CGCAATTGAA ACGTACATAC CTCAACTGAA GCAAAAGTAT AAACCAACAG TTACAATTGT 3660 

AAATGCTGAA AATGCAGCAC ATGGTAAAGG TTTGACTGAA AAAATATATA AACAATTACT 3720

AAGAAATGGT GTAGATTTCA TGACTATGGG TAATCACACA TATGGTCAAC GTGAAATTTA 3780 

TGATTTTATA GATGAAGCAA AACGACTAGT AAGACCAGCG AATTTTCCGG ATGAAGCGCC 3840 

GGGAATTGGT ATGAGATTTA TACAAATTAA TGATATTAAA CTTGCAGTTA TTAATCTGCA 3900 

AGGAAGA6CG TTTATGCCAG ATATTGATGA TCCTTTTAAA AAGGCAGATC AATTAGTCAA 3960 

GGAAGCACAA GAACAAACTC CGTTTATATT TGTTGATTTT CAT6CAGAAA CAACTTCTGA 4020 

AAAGTATGCA ATGGGATGGC ATTTAGATGG TAGAsTAGCG CTGTTGTTGG AACGCATACA 4080 

CACATTCAAA CAGCAGATGA ACGTATTTTA CCAAAGGGGA CAGGGTATAT AACGGATGTT 4140 

GGTATGACAG GTTTTTATGA TGGCATTTTA GGAATAAATA AAACAGAGGT AATTGAGCGT 4200

TTTATCACTA GTTTGCCACA AAGACATGTT GTTCCAAATG AAGGTAGAAG TGTATTATCT 4260 

GGTGTTGTTA TTGATTTAGA CAAAGAAGGT AAAACAAAGC ACATCGAACG TATATTGATA 4320 

AATGATGACC ATCCATTTTC AACATTTTAA AATTACGTAA GTAAACATTC GAATTGGACC 4380

CTATCGTCCA TTAGTATGAA TTTAATATAG TACCACTGTT TACATAGTAA ATCGGTGGTT 4440 

CTTTTTGTTA TCATTTAATA TGAAATATAT CCATAGGAGG CATATAACTA TGAAACCACA 4500 

ATTATCGTGG AAAGTTGGCG GTCAACAAGG CGAAGGTATT GAATCAACTG GGGAAATCTT 4560

CGCTACGGCT ATGAATAGAA AAGGATATTA TTTATATGGA TATAGACATT TTTCAAGTCG 4620 

TATCAAAGGT GGACATACGA ATAATAAAAT TAGAGTTTCT ACGACGCCTG TTCATGCAAT 4680 

TAGTGATGAT TTAGATATTT TGATTGCATT TGACCAAGAA ACAATTGATG TTAACCATCA 4740 

TGAAATGAGA GAAGACAGTA TTATTTTArC TGATGCCAAG GCTAAACCTG TGAAaCCAGA 4800 

AGGATGTCAT GCACAGCTTA TTGAATTACC TTTTACAGCA ACCGCTAAAG AATTAGGTAC 4860 

AGCATTAATG AAAAACATGG TTGCAATAGG TGCTACTAGC GCATTGATGA ATTTGAATAC 4920 

AAATACATTT GAAGAACTTA TTACTAATAT GTTTTCTAAA AAAGGTGACA AGGTAGTTGA 4980 

AGTCAATATC CAAGCATTAA ACGAAGGTTA TCAATTAATG CAATCTCGCT TACCTGAAAT 5040 

CTACGGGGAC TTTGAATTAG AGTCAACAGA TGCACTACCA CATCTATATA TGATTGGTAA 5100 

CGATGCCATT GGATTAGGTG CAATTGCTGC AGGTTCACAA TTTATGGCGG CATATCCTAT 5160 

TACACCTGCG TCTGAAGTTA TGGAATATAT GATTGCCAAT ATATCTAAAG TAAACGGAGC 5220 

GGTTATTCAA ACAGAAGATG AAATTGCTGC TGTAACTATG GCTATTGGTG CAAATTATGG 5280 

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TGGATTATCT GGTATGACTG AAACGCCATT AGTCATTATT AATACCC7UVC GAGGTGGACC 5400 

TTCTACTGGA TTACCTACGA AACAAGAACA GTCAGATTTA ATGCAAATGA TTTATGGTAC 5460 

ACATGGTGAT ATTCCAAAAA TTGTTGTAGC ACCAACAGAT GCAGAAGATG CATTTTATTT 5520 

AACTATGGAA GCATTTAATT TAGCAGAACA ATATCAATGC CCTGTTATAG TTCTAAGTGA 5580 

TTTGCAATTA TCTTTAGGTA AACAAACTGT TGAAAAATTA GATTATAATC GTATTGAAAT 5640

TAAACGTGGT GAAATCATTC AATCTGATAT TGAACGTGAA GAAGATGATA AAGGTTATTT 5700 

CAAGCGTTAT GCGTtAACAT CCGATGGTGT TTCTCCTAGA CCTATCCCCG GTGTTAAAGG 5760 

AGGTATTCAT CATATAACTG GTGTGGAaCa CAATGAAGAA GGTAAACCTA GTGAATCTGC 5820 

GTCAAATAGA CAACAACAAA TGGAAAAACG AATGCGTAAA ATTGAGCAGT TACTAATTGA 5880 

ATCXjCCAGTA GAAGCTAACT TACAACATGA GGATGCAGAT ATTCTTTATA TCGGTTTTAT 5940 

TTCTACAAAA GGTGCAATTC AAGAAGGTAG TAACCGTTTG AATCAACAAG GCATAAAAGT 6000 

TAACACTATA CAAATTAGAC AATTGCATCC ATTCCCAACA AGCGTTATTC AAGATGCAGT 6060 

TAATAAAGCG AAGAAAGTCG TTGTAGTGGA GCACAATTAT CAAGGACAAT TGGCTAGTAT 6120

TATAAAAATG AATGTCAATA TTCATGATAA GATTGAAAAT TATACAAAGT ATGATGGGAC 6180 

ACCTTTCCTA CCACATGAAA TCGAAGAAAA AGGCAAAATA ATTGCTACTG AAATAAAGGA 6240 

GATGGTATAG ATGGCGACAT TTAAAGATTT TAGAAATAAT GTTAAGCCTA ACTGGTGCCC 6300 

CGGATGTGGC GATTTCTCAG TACAAGCTGC AATTCAAAAA GCAGCCGCAA ATATAGGGTT 6360 

AGAACCTGAA GAAGTAGCTA TCATCACCGG TATAGGATGT TCTGGCCGTC TTTCAGGATA 6420 

TATTAATTCT TATGGCGTTC ATTCTATTCA CGGACGTGCA TTACCTTTAG CTCAAGGTGT 6480 

AAAAATGGCG AATAAAGATT TAACTGTTAT TGCATOSGGA GGAGATGGTG ATGGTTATGC 6540 

TATASGTATG GGGCATACAA TCCATGCTTT AAGAAGAAAT ATGAACATGA CGTATATAGT 6600 

CATGGATAAT CAAATTTATG GTTTGACAAA GGGACAAACA TCGCCGTCAT CAGCAGTAGG 6660 

ATTTGTTACT AAAACAACGC CAAAAGGTAA TATAGAAAAA AATGTTGCGC CTTTAGAATT 6720 

AGTATTATCA TCTGGTGCCA CATTTGTAGC CCAAGGTTTT TCAAGCGATA TTAAAGGATT 6780

AACAAAACTA ATTGAAGATG cAATTAATCA TGATGGATTT TCATTCGTTA ATGTCTTTTC 6840 

ACCATGTGTG ACTTATAATA AAATTAACAC ATACGATTGG TTTaAAGAAC ATTTAACAAG 6900 

TGTTGATGAC ATTGAAAATT ATGATTCTAC AGATAAACAA TTAGCGACTA AAACTGTTAT 6960 

TGAACATGAA TCTTTAGTAA CTGGTATTGT TTATCaAGAT AAAGAAACAC CATCATATGA 7020 

ATCtCAAATT AAAGAGTTAG ATGATmCACC ACTTGCTAAA AGAGATATCa AAATTaCTGA 7080 



TGTATTTATA ACAGATCCAT TTATGCTACT CAGTTTTTTA CTATTACAAA AAATAAAGGA 7200 

GTTTTTAAAA ATGAAAGACA CATTAATGAG TATACAAATA ATTCCTAAAA CACCAAACAA 7260 

TGACAATGTT ATACCTTACG TAGACGAGGC GATTAAAATA ATTGACGAAT CTGGTTTGCA 7320 

TTTTAGAGTA GGTCCX5TTAG AAACGACAGT ACAAGGAAAT ATGAATGAAT GTTTAATTTT 7380 

AATACAATCA TTAAATGAAC GAATGGTGGA ACTTGAATGT CCAAGTATTA TTAGCCAAGT 7440 

TAAGTTTTAT CATGTGCCAG ATGGCATCAC TATTGAAACT TTAACTGAAA AATATGATGA 7500 

ATAACATTAA AAGTGAAGTA AACTGGATTT GAATTGGCTT GTTAGAGATG ACGTATAACT 7560 

TTAACTGTTT TTGCACTTTA TAGTTAAATT TAATATAATT ATTAAATGAT ACGGGCAAAT 7620

AGAAAGGATT TTGTAAAGTG AACGAAGAAC AAAGAAAAGC AAGTTCTGTA GATGTTTTAG 7680 

CTGAGAGAGA TAAGAAAGCA GAAAAAGATT ATAGTAAATA TTTTGAACAT GTTTATCAGC 7740 

CGCCTAATTT AAAAGCAAGC GCAAAAAAAG AGGTnAAA 7778 
(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1128 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

AGATGAAGTT GTTACgAAAA TTGCGTACGC TGTTTCAGAA CATGTCAAAA TAGAAACAGG 60

TAATCCATTC TTTCAAACAT CACATAGTGG TTGTGCGACG GGCGGATCCT GTAATTGTTC 120

ATTATAAAAA ACATCGAGTC AGAAAAAGGT GGTTATTGAA cCACTAACTA GCATCTGACT 180

cgatSttttt ATTTATTCGG GATTGTTTGT TTGAATTGTT GTGCTAAATC TGGTCGATCT 240

GTCACAATCG TGTGTGCACC TTTTTGGTAT AAATCATTCA TCAGATTTAT ACTATTTACG 300

CCATAATAGC CTGGAATGAT ATTCATATCA TTTAACCATT TGATAAAACG AGATGAAGTC 360

AAATCAATGC CTTTAAAATG AGTAGGCATT TGGAACGTTT GTGCTAATGG TTGGTAGTAC 420

CTACCACCTA ATAAATGATA TTTTAAAAAT GCTTCTGTAA CTTCCTGTTG GCTAGCACCA 480

ATTGCGACGG ATCCTTGTGC AATTTTATTA AAACGAACGA TTTGTTCTTT ATAAAAACTT 540

GTCACAAGAA CGCGGTCAAA TGCTTGATTT TCTGCAATTG TATCAAACAT AATTTGTGGT 600

GCGATTGAGC CTTCATAGGA TTCAGGAGCA TCTTTTAAGT CTACGTTTAT ATACATATCA 660

GGATATTGCT TCAGCAACTC ATCGAAGGTT AGTATAGCTG TGTGTGCATG ACCACGATAT 720
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AATGTATGGG CACTAACTTT TCCAGAGCCG TTCGTCGTTC TATCAACAGT TGCGTCATGA B40 

AAAACGATAA GCTGTTGATC TTTTGTGAGT CTCACATCTG TTTCAAAGCC ATCAACGCCT 900 

AATTGTTTAG CATAGTCAAA TGCAAGTTGC GTTTGCTCTG GTCTTAAAGC CATACCACCG 960 

CGATGCGCAA ATATATATGG TGCATTGCCT TTGAAAAAAG CAGGGATGGT TTGCTTTTTA 1020 

GTAATCACTT TATTTTTATT GATCATTAAT AGACTACTTA AAAATCCAGC ACCGACTAGT 1080

ACCGCATTTA AAATGTTTCT GTTTACnTTT TTCATAAAAA ATTCCTCC 1128 
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6252 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
CAAGCAAACA ATCGTCGATA AAATTGCTAA AATAATAAAA GTAATTCGAA CTTTCATCAT 60 

GATCATCCTT TGTTTATAGA GTCAATATAA GTATGGAATA TGTTAGGTAT ATAGTCAAAT 120 
GCGTCAACTA ATGGGAATTT TGGCATAGAT AGAGAATTTA AGGCAATTAA AAAGGCATCA 180
AACAGTAATA TGCTGCTTGA TGCCCAAATG ATGACTTTAG CTAAATTGAT TAGTCACTTT 240
TAAAGATAAA GAATTGTCAT GAATTAAAAC TCATGTAATG ATGTGTTACA TTTCGCAATG 300 

ATGGCTTTCA GTTATTTATC GATAACATCA CTCTTGATAC CTTTAGATTT TAAGAAATCT 360 

TTAATTTTAT CTTGTTGCTT TTTATTAACA TCACCGGCAT ATTTTGTTGG CACGTCGACA 420 

ACATTGATTT TATTTTGCGG TTGATAGCTA AGCTTTTCAA TATCTTCATC AACATTGGCG 480 

ATTOTACTAT TTAAAGCTTT GAAGTAATTC ATCATTAATT CAACGGGTTT CTTATATTCT 540 

TTAGGAATAT TGTTTTCAGT GACAAATTTC TTGAAATGCA AATCGTTTTT AACAGCTAAG 600 

TTAGATAAGT GGCTAAGTGT TTCTGCTTGT TTTTCAGTCA CTTTTGTTTG ACTGTCAATT 660 

TGTTThTCVh GTTTATGTTG CATAATATAT TTGTTATCAA GTATATCGCT ATTTACAGAC 720

AAATACTTTT CTATAGCTTG CTTCATCTCT GCATCACTAA TATCACTATT TTTCTTATCT 780 

GAGTTAAAGA TATCTTTTGT tTCTAATTTT TTAGCGCTTT TAGGTGCATG GATGCCAGTA 840 

CTTGTATGAT GATCTTCGTT ATCAGATTGA TCGGACGCGC AACCTGTAAG AATTAATGTC 900 

GATGCTAAAA ATGTACTTAG TAGTAATCTC TTTTTCATAA TGTAATATAA CTCCTTAGTT 960 

TATCTTTAAT TGAAAAAATA TGTATTCATG TTTAATAGAG TAACATTGAA TTAGTTTGGA 1020 

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TCTATCAATA ATGCATCATT TTGGACGTTG TTAAGGATAG CTTTATCTAT AAATAACTGC 114 0 

ATAATTGGTT GTACTAATTT AGACGTAGGT ATCGTACGTA AAAGCATAAT AATTTCGTTC 1200 

ACATACTTTT CTTTCTCAAT ATCATTTTTC ATATTGATTT GTTTGCGAGA GGTACATACT 1260 

TTAAGCATTA TCGCACATCT CGTTGTATAT ATTAAGTTTA TCATAACATG ATTTTATGTC 1320 

GGGATAAAAA AATAACAGCA TCTTAACAAA TGTAAGATAC TGTCAGTGAA ATGAATGAAA 1380 

CTTTAGTTTC TGaTAATATA GTCAAAGGCA TTTAATGCTG CATTTGCACC AGCGCCCATT 1440 

GAAATGATAA TTTGTTTGTT CTTCTGATCT GTGACATCCC CAGCAGCAAA TATTCCAGGA 1500 

ACATTCGTAT TATTGTTACG ATCAATCACA ATTTCACCAC GTTCGTTTAA TTCAACAGCA 1560

TCGTTTAACC ATGATGTGTT TGGAAGTAAA CCAATTTGAA CAAAGATACC ATCTAAGTTA 1620 

AGTAGATGTT CTTCGCCGGT GTTCATGTCT TCGTAACGTA TACCTGTAAC ATGGTCTTCT 1680 

CCGACAACTT CAGTAGTTTT GGCATTTGTT TTGATATCAA CATTTGATAA AGAACGTAAA 1740 

CGATCTTGTA ACACGTTGTC TGCTTTTAAT TCGCTAGCGA ATTCGAATAA TGTAACATGA 1800 

TTAACGATAC CAGCAAGGTC AATTGCTGCT TCAACCCCAG AGTTACCGCC ACCGATAACT 1860 

GCTACGTCTT TATTTTCAAA TAGAGGTCCG TCACAGTGAG GGCAGAATGC AACACCTTTA 1920 

TTAATCAATT GCTCTTCACC TGGAATGTTT AGCTTACGCC AACCTGCACC AGTAGCAATA 1980 

ATGACTGTTT TACTTTCTAA GACAGCACCG TTTTCTAACG TAACTTTAAT TGCTTCGTCA 2040

GTCTTTTCGA TATCTGTAGC ACGTATACCT GTCATTGCAT CAATGTCATA TTGATCAATG 2100 

TGCGCTGCTA AGTTAGAAGA AAATTCAGAA CCAGTTGTTT CTTTAACAGT AATGAAGTTC 2160 

TCAATACCAG CAGTATCATT AACTTGGCCA CCGATACGAT CAGCAACTAT ACCAGTACGT 2220

AAACCTTTAC GTGCTGTGTA AATCGCTGCA CTACCACTAG CAGGACCACC ACCAACGATT 2280 

AAGACATCAT AAGGTTCTTT ATTTTCAAAC TCAGATGCAT CTGCCGTACT GCCTAGTTTC 2340 

GAAAGAATAT CTTGGATTGT CATACGACCA TTGCCAAATT CTTCGCCATT TAAAAAGACA 2400 

GCAGGGACTG CCATGATGTT TTCAGATTCT TCACGGAACA CTGCACCATC AATCATAGAA 2460 

TGCGTGATGT TAGGGTTGAT CACACTCATT AAGTTAAGTG CTTGAACGAC ATCAGGACAT 2520

TTTTGACACG TTAAACTAAT GAATGTTTCA AAATGGAATG AACCTTCTAA TTTmAATT 2580 

TGGTCAATGA TTGACTGTTT TTCTTTAGGT GCACGACCAC TAACCTGTAA AATTGCTAAA 2640 

ACAAGTGAGT TAAACTCGTG ACCTAATGGA ATACCTGCAA ATGTTACACC TGTTTCTTCG 2700 

CCAGGACGAT TGACTGAGAA ACTTGGTGTA CGTTTTAAAG ATTTTTCAGA AAGAGATAGT 2760 

CTAGGTGACA TATCAGTAAT TTCTGTCAAC AAATCTTTAA GTTCTTTGGA TTTATCATCT 2820 

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TGTTGTTTTA AATCAGCATT AAGCATGGTT GTAATGCCTC CTTAGATTTT ACCTACTAAA 2940 

TCTAAACCAG GTTGCAATGT TTTAGCGCCT TCTTCCCATT TAGCTGGGCA TACTTCGCCA 3000 

GGGTTTTTAC GAACATATTG AGCTGCTTTG ATTTTGTGAG CTAATGTACT AGCGTCACGG 3060 

CCAATTCCGT CAGCGTTAAT TTCAGATGCT TGTACAACAC CGTCTGGGTC GATAATGAAT 3120 

GTACCACGTT GAGCTAAACC AGTAGCTTCA TCTAATACAT CAAAATTACG AGTGATTGTT 3180 

TGTGATGGGT CACCAATCAT AGTGTAAGTG ATTTTGCTAA TTGCATCTGA ATGGTCATGC 3240 

CATGCTTTGT GTACGAAGTG AGTATCAGTT GATACTGAGA ATACATTTAC GCCTAATTTT 3300 

TGTAATTCTT CATATTGGTT TTGTAAGTCT TCTAATTCAG TTGGACAAAC GAATGAGAAG 3360 

TCAGCAGGAT AGAAGCATAC TACGCTCCAA GAACCTTTTA AATCTTCTTG TGTAACTTCT 3420 

TTAAATTGAT CTTTTTTTGG ATCGAAArCT TGCGCTGTAA ATGGTAAGAT TTCTTTGTTA 3480 

ATTAATGACA TAAATATCTT CCTCCTAAGA ATTTAAGTAT GAATTAGAAC TATCAATTGA 3540 

TTGCGCTTAA TTATAATAAT TCTAATCTCT TAGTTAGCAT TATTACATTT TGATCCAGAA 3600 

TAGTCAACTG GATAACTTTG TAAAGTGAAT GATTACTTTT AAAATAAAGA AAGATAATAT 3660 

AAAGTGCTTT GATAATGGAT TTTGTAGTTG ATGATTTAAA AGGTTGTGTC TATATTTAAT 3720 

ATCTTGATTT TAATGTAAAA AATGTAAAAA AAGAAGATTT GTATTCTCAA CTAAGTCAAC 3780 

CTTATTGATA ATGGTATGAG AATATTTGTT CGAGATGGAT GAAGGTAATG A6TGAGAAAC 3840

TGGATTTTTA AAGTATGAGA CAATATTTTA AAAAGTTCAA TTATTAACTT ATAAGCAAAT 3900

AATTGCTATA AAAAAGTTTG GACGTGTACA ATTGCAATAT GAAGATTTTA AATTAATTGT- 3960 

AAAGTATCGA GGAGTGGGTA ACGTGTCAGA ACATGTATAT AATCTTGTGA AAAAGCATCA 4020 

TTCTGTTAGA AAATTTAAGA ATAAACCTTT AAGTGAAGAC GTTGTTAAGA AATTGGTAGA 4080 

AGC33SGACAA AGCQCTTCGA CX5TCAAGTTT CCTGCAAGCA TACTCAATTA TTGGTATCGA 4140 

CGAT6AGAAG ATTAAAGAAA ATTTACGAGA AGTTTCTGGA CAACCTTATG TTGTAGAAAA 4200 

TQGCTATTTA TTCGTCTTTG TTATTGATTA TTATCGTCAT CATTTAGTTG ATCAACATGC 4260 

TGAAACTGAT ATGGAAAATG CATATGGTTC AACX3GAAGGT TTGCTAGTAG GTGCAATCGA 4320

TGCAGCATTA GTTGCCGAAA ATATTGCGGT AACTGCTGAA GATATGGGGT ATGGCATTGT 4380 

CTTTTTAGGA TCATTAAGAA ATGATGTTGA ACGCGTTCGA GAAATTTTAG ACTTACCTGA 4440 

CTATGTCTTC CCGGTATTTG GTATGGCAGT AGGGGAACCc GCAGATGACG AAAATGGTGC 4500 

AGCCAAGCCA CGCTTACCAT TTGACCATGT CTTCCATCAT AATAAGTATC ATGCTGATAA 4560 

GGAAACACAG TATGCACAAA TGGCAGATTA CGACCAGACA ATCAGCGAGT ACTATGATCA 4620 

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CAAAGCAAGA TTAGATATGT TAGAACAATT GCAAAAATCA GGCTTAATAC AGCGATAgCA 4740 

AGATACCAAA ATAACCCGCC CCCCTCTAGC TTAAAATGAT AAGTATAGCT AGAGGGGGCG 4800 

GGTATTTCTT GCAATGAATT AGTGTGAAGT TAATGCAGCA TTATCATTTG AATCGAAAGT 4860 

ATCTTTATCC CAATGTTTAG TTAACTTGGC GGTACCTGTA CCAGCTAGCA TTGAATCGTT 4920 

CACGTTTAAT GCTGTTCTAC CCATGTCAAT CAATGGTTCA ACGGAGATGA GCACGCCGGc 4980 

TAAAGCGACT GGCAAGTTTA ACGTTGACAA CACCAATATG GATGCAAATG TAGCCCCGCC 5040 

ACCGACGCCA GCAACGCCQA ATGAACTAAT AATCACGACA GCGATTAACG TTACAATAAA 5100 

TTGTAAATCA ATTTCTACAT TAGCGACXX3G TGCGACCATA ATTGCAAGCA TGGCAGGGTA 5160

AATGCCTGCA CAACCATTTT GTCCAATCGA CAATCCAAAT 6TCGCAGCGA AATTGGCAAT 5220 

ACCTTCTGGC ACGCCTAGAC GTCTTGTTTG TGTTTGTACA TTCAATGGTA AGGCACCCGC 5280 

GCTTGAGCGT GATGTGAATG CAAAGATTAA TACTTCCAAA GTCTTTTTAA CATAGCGAAT 5340 

TGGGCTAATA CCTAACAGGC TTAAAATAAT TAAGTGAATG ATATACATCG TAATTAATGC 5400 

AGCGTACGAT GCGATTAAGA ATTTTCCTAA AGTCCAAATG GCGCCAAAGT CACTTGTCGA 5460 

TAATGTGTTG GCCATAATTG CTAATACACC GTATGGCGTT AAACGTAAGA CGAACGTCAC 5520 

AATCGCCATT ACTAGTGAAT AGATAGCGTC AATCGCACGC TTAAGCAATT CACCATGATC 5580 

AGGTTGTTTG CGTnTACGCG TAAATAAGCA AATCCTATAA ACGAAGCAAA TATCACGACA 5640 

GCAATCGTGG aAGTTGCACG TTGTCCaGTG AAATCTAAGA ATGGATTTTT AGGCAATAAT 5700 

TCCAAAATTT GTTGTGGTAA CGTATGTGCT GTTAAATCTT TCGCTTGTTT AGCAATTTCG 5760 

CTTCCACGTG CTTGTTCAGC GTTACCAAGG TTAATTGTTG ATGCATCTAA ACCAAACACC 5820

AAGGCATACA CAACACCAAC AATCGCAGCA ATGGTGACAG TGCCAATTAA AAAGATAAAA 5880 

ATGAGACTAC CAATTTTAGC AAACTTTTCT CCGATTTGAA TTTTAGTGAA TGCAGCTACA 5940 

ATAGAAATGA AAATTAAAGG CATAACAATC ATTTGCAACA ATGCAACGTA ACCTTGTCCG 6000 

ACAATGTTGA ACCAGTCACT TGTTGATGTA ATAACATTCG AATGTGTGCC ATAAATAAGA 6060 

TGCAATAACA CACCGAATAC TATACCAATC CCTAAAGCTG TAAACACACG TTTCGCAAAA 6120 

GATATATGTT TGCGAGCCAT CATGTGCAAT ATTACGATGA AAATCACCAA TACAATAATA 6180 

TTAATCAGTG TAAGAAAAGC ATTCATGAAC GTCACTCCTT AAATTTTTGA ATATAATTCC 6240 

GACTAGTATG CT 6252

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6730 base pairs 
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75 



(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

ATCAAATCnC AAAATATTTA TTAATnAnAA GGGGATTATC CaTGTgAGAA ACAAAGTAAT 60 

GCTCTTTTTT TACCTCTTGT GGGTTGAAAA aTGGATCATC AGAGATAGAC TTCTTCTTTT 120

TCGAAGATGA CATTTGATAC TTTAATCTTC TAAAACCATA ACTTGTCGCA TCAAAAATGC 180 

CTTCTTGTAC AAGTAAAATC AAAAATATGC TAATAAAAAT AATTAATGAA ACATAAAACA 240 

ATATATTTAA ATATGTAATG ATAGTATGGC TATTAAAAAG CCATATAATA AACGTTAATA 300 

TTGGCGTTAT TAGTGCCATT CCAAGCCATT TTTTCAACAT TTGATCACTC CCACTTATAG 360 

AAAACTCTTA CGCATAGTTT ACATTAAAAT CAGACATTGA GGAATGATTT TTTAATTTCT 420 

TCAGCTTTAT TGAAATTCTA AAATCAATCA TTCTTCATTA GTTTAAAGCA AAAAAATATT 480 

GATATATAGT AAATATTGTA TATATAATAT TAGTTAAGAT TTCaGAAAAT TTTGAAGGGA 540 

ATGGAAATTT AGAAATCGGA ATTTGTTAGA GGAGGGGATT AGATGGGGAA ATATATTTTC 600 

AAACGATTTA TTTATATGCT TATTTCTTTA TTTATTATTA TTACAATTAC ATTTTTCTTA 660 

ATGAAATTAA TGCCAGGTTC GCCATTTAAC GATGCTAAAT TAAATGCTGA ACAAAAAGAA 720 

ATTTTAAATG AAAAATATGG ATTAAATGAT CCTGtAGCTA CGCAgTATTT ACATTATTTA 780

AAAAATGTTG TTACAGGCGA TTTTGGTAAT TCATTCCAGT ATCATAATCA ACCTGTGTGG 840 

GATTTGATTA AACCGAGACT ACTACCTTCT TTTGAAATGG GTCTTACAGC AATGTTCaTC 900 

GGTGTGATAC TGGGACTTAT TTTAGGTGTT GCAGCAGCTA CTAAACAAAA TTCTTGGGTT 960 

GACTATACAA CTACAGTTAT TTCAGTTATT GCAGTATCTG TACCATCTTT TGTACTTGCT 1020 

GTACTTTTAC AATATGTATT TGCAGTTAAA TTAAGATGGT TCCCAGTAGC TGGATGGGAA 1080 

GGTTTTTCGA CCGCGGTATT ACCGTCACTT GCATTATCTG CAGCTGTTTT AGCAACTGTC 1140 

GCCAGATACA TAAGAGCAGA GATGATAGAG GTATTAAGTT CAGACTATAT TTTATTAGCG 1200 

AGAGCTAAAG GTAATTCGAC AATGCGTGTA CTTTTTGGAC ATGCACTTAG AAATGCTTTA 1260

ATTCCAATTA TTACAATTAT CGTTCCCATG TTAGCAAGTA TTTTAACAGG CACTTTAACA 1320 

ATTGAAAATA TTTTTGGAGT TCCTGGATTA GGGGATCAAT TCGTACGTTC AATTACAACA 1380 

AATGATTTCT CAGTAATCAT GGCAATCACA CTATTATTTA GCACACTGTT TATCGTTTCT 1440 

ATTTTTATTG TAGATATTTT GTACGGTGTG ATAGATCCAC GAATTCGTGT TCcAAGgAGG 1500 

TAAAAAATAA TGGCTGAAAA TAAAAACAAT TTGTCGATTA ACGACGATCA TTCTAATGCA 1560 

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TGAATCAGGA ACCTGAAATG CAACGAGAAA GCAAAAACTT TTGGCAAGAT GCTTGGGCTC 1680 

AGTTAAAACG AAATAAGTTA GCTGTTGTCG GTATGATAGG TTTAATTATC ATTGTAATAT 1740 

TTGCTTTTAT CGGTCCAGTT ATAAATAAAC ATGATTATGC TGAACAAAAT GTAGAACATA 1800 

GAAATCTTCC GGCAAAAATA CCTGTATTAG ACAAAGTTCC ATTTTTACCT TTTGATGGTA 1860 

AAGATGCAGA TGGCAAGGAT GCTTATAAAG CAGCAAATGC TAAAGAAAAT TATTGGTTTG 1920 

GTACTGATCA GTTGGGTCGA GATTTATGGA CAAGAACATG GAAAGGTGCT CAAATTTCAT 1980 

TGTTTATCGG TGTTGTTGCA GCGATGTTAG ATATTTTTAT TGGTGTTGTA TATGGTGCGA 2040 

TTTCTGGATT CTTCGGTGQA CGTGTCGATA CGATTATGCA ACGTATACTT GAAGTCATAG 2100 

CATCTATTCC GAATTTAATT GTCX5TAATTT TATTTGTATT AATTTTTGAA CCATCCATTT 2160 

GGACAATTAT ATTGGCTATG TCTATCACAG GCTGGTTAGG CATGAGCAGA GTTGTACGTG 2220 

GAGAATTTTT AAAATTAAAA AATCAAGAGT TTGTCATGGC TTCGAAAACA TTGGGGGCTT 2280 

CAAAATTCAA ATTGATATTT AAGCATATTT TACCTAATAC ATTAGGTGCT ATCGTGGTTA 2340 

CATCAATGTT TACAGTACCT AGTGCTATTT TCTTCGAAGC ATTTTTAAGT TTCATTGGTA 2400 

TAGGTGTACC CGCACCTCAA ACATCGTTAG GGTCATTAGT AAATGATGGG CGCGCAATGT 2460 

TATTAATTTA TCCACATGAA TTATTTATAC CAGCAATGAT TTTAAGTTTA TTAATTCTAT 2520 

TCTTTTACTT ATTTAGTGAT GGATTACGTG ATGCATTTGA TCCGAAAATG CGTAAATAAA 2580

AAGGGGGCAT AGCATATGAC TGAAAGAATA TTAGAAGTAA ATGATTTGCA TGTTTCCTTT 2640 

GATATTACAG CAGGGGAAGT GCAGGCAGTG AGAGGCGTAG ATTTTTATTT GAACAAAGGG 2700 

GAAACATTGG CAATTGTTGG TGAATCAGGT TCAGGTAAAT CTGTAACAAC AAAAGCAATT 2760 

ACAAAATTAT TCCAAGGGGA CACAGGAAGA ATTAAAAAGG GAGAAATTTT ATTTTTAGGG 2820 

GAAGATTTAG CAAAAAAACC TGAAAATGAG TTGATTAAAT TACGTGGCAA AGATATTTCA 2880 

ATGATCTTTC AAGATCCAAT GACATCTTTA AACCCAACGA TGCAAATTGG TAAACAAGTC 2940 

ATGGAACCAT TAATTAAGCA CAAAAATTAT AGTAAAGCAC AAGCTAAAAA GCGCGCATTG 3000 

GAAATACTAA ATCTTGTAGG TTTACCAAAT GCAGAAAAAA GATTTAAAGC ATATCCTCAT 3060 

CAATTTTCAG GTGGACAAAG GCAAAGAATT GTTATTGCAA CCGCATTAGC TTGTGAACCT 3120 

AAAGTGCTCA TTGCTGATGA ACCAACGACT GCATTAGACG TAACGATGCA GGCACAAATT 3180 

TTAGATTTAA TGAAAGAACT ACAACAAAAA ATCGATACAG CAATTATTTT TATAACGCAT 3240 

GATTTAGGGG TTGTTGCGAA TATTGCTGAT AGAGTGGCAG TTATGTATGG TGGTCAAATG 3300 

GTTGAAACAG GAGATGTTAA CGAAATATTT TATGATCCAA AGCATCCATA TACATGGGGA 3360 
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GGAGCGCCAC CTGATTTATT ACACCCACCT AAAGGTGATG CATTTGCGAG ACGTAGcAAT 34 80 

ATGCATTAGA TATTGATTTT AAAGTAGAAC CACCGTGGTT TAAAGTTTCA CCGACACATT 3540 

TTGTGAAATC TTGGTTATTA GACGCACGTG CACCAAAAGT TGAACTACCC GAGCTGGTAA 3600 

AACAACGTAT GAAACCGATG CCTAATAATT ATGAAAAACC ACTCAAGGTA GAAAGGGTGT 3660 

CGTTCAATGA AAAATGATGA AGTGCTATTA TCTATTAAAA ATTTAAAGCA ATATTTTAAC 3720

GCAGGAAAGA AAAACGAAGT GgaGCGATTG AAAATATTTC GTTTGATATA TACAAAGGGG 3780 

AAACATTAGG TTTAGTAGGA GAATCGGGGT GTGGTAAATC TACAACTGGT AAATCAATTA 3840 

TTAAACTTAA TGATATTACA AGTGGAGAAA TTTTGTATGA GGGTATTGAT ATACAAAAGA 3900

TTCGTAAACG TAAAGATTTG CTTAAATTTA ATAAAAAGAT ACAGATGATT TTTCAAGACC 3960 

CATATGCGTC TTTAAATCCT AGGTTAAAAG TAATGGATAT AGTAGCTGAA GGTATTGATA 4020 

TCCATCATTT AGCAACTGaT AAGCGTGACC GAAAAAAACG TGTCTATGaT TTACTTGaAA 4080 

CTGTTGGATT AAGTAAAGAA CATGCCAATC GCTATCCTCA TGAATTTTCA GGTGGaCAAC 4140 

GCCAACGTAT TGGaATTGCC CGTGcATTAG CCGTTGaACC AGAATTCATT ATCGCGGACG 4200 

AACCAATATC GGCATTGGAT GTTTCAATCC AAGCTCAAGT AGTTAATTTA TTATTAAAAT 4260 

TACAACGTGA AAGAGGGATT ACGTTCCTAT TTATAGCTCA TGATCTATCA ATGGTGAAGT 4320 

ATATTTCAGA TCGTATTGCA GTCATGCATT TTGGGAAAAT AGTTGAAATT GGACCGGCAG 4380

AAGAAATTTA TCAAAATCCA TTACACGATT ATACTAAGTC TTTATTATCA GCCATTCCAC 4440 

AACCTGATCC TGAATCAGAA CGCAGTCGCA AACGATTTAG TTATATTGAT GATGAAGCAA 4500 

ATAATCATTT AAGACAATTA CATGAAATTA GACCGAATCA CTTTGTCTTT AGTACTGAAG 4560 

AAGAAGCGGC ACAACTACGA GAAAATAAAT TGGTGACACA AAATTAAGGG GAAGGGGGAA 4620 

ATGdAATGAC GAGAAAATTT AGAACACTTA TTTTAATTTT GATTGCTACA ATTGCATTAA 4680 

GTGGTTGTGC TAATGACGAT GGTATTTATT CAGATAAAGG TCAAGTATTC AGAAAAATTT 4740 

TGTCATCAGA CTTAACATCC CTTGATACAT CATTAATAAC GGATGAAATA TCTTCTGAAG 4800 

TGACTGCGCA AACATTCGAA GGTTTATACA CATTAGGAAA AGGTGACAAA CCGGTGTTAG 4860

GTGTTGCGAA AGCTTTTCCT GAAAAGAGTA AAGATGGTAA AACTTTAAAG GTTAAATTAA 4920 

GAAGCGATGC TAAATGGAGC AATGGTGACA AAGT6ACTGC ACAAGACTTT GTTTATGCTT 4980 

GGAGAAAAAC AGTTGACCCT AAAACAGGTT CTGAATTTGC ATACATTATG GGGGACATTA 5040

AAAATGCGAG TGATATTAGT ACTGGTAAGA AACCTGTAGA GCAATTAGGT ATCAAAGCAT 5100 

TAAATGATGA AACATTACAA ATTGAATTAG AAAAGCCGGT TCCATATATT AATCAATTAT 5160 
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ACGGTACGGC AGCTGATAGA GCGGTATACA ATGGTCCaTT TAAAGTTGAT GATTGGAAAC 5280 

AAGAAGATAA AACCTTACTA TCTAAAAATC AGTATTATTG GGATAAAAAG AATGTAAAAT 5340 

TAGATAAAGT GAATTATAAA GTTATTAAAG ACTTACAAGC CGGTGCATCA TTGTATGATA 5400 

CTGAATCAGT AGATGACGCA TTTATTACTG CAGATCAAGT AAATAAATAT AAAGACAACA 5460 

AAGGATTAAA CTTTGTGTTA ACGACTGGGA CATTTTTTGT AAAAATGAAT GAAAAACAAT 5520

ATCCTGATTT TAAAAACAAA AATTTAAGAT TGsTATCXSCA CAAGCAATAG ATAAAAAAGG 5580 

ATACGTTGAT TCAGTGAAAA ACAATGGCTC AATTCCTTCC GATACACTAA CAGCCAAAGG 5640 

AATTGCGAAA GCGCCTAATG GCAAAGATTA TGCGAGTACC ATGAATTCGC CTTTAAAATA 5700 

TAATCCTAAA GAAGCAAGAG CACACTGGGA CAAAGCTAAA AAAGAGTTAG GTAAAAATGA 5760 

AGTGACATTT TCAATGAACA CAGAAGATAC ACCAGATGCA AAAATATCTG CTGAATATAT 5820 

CAAATCGCAA GTTGAGAAAA ATTTACCAGG AGTTACTTTG AAAATTAAGC AATTACCGTT 5880 

TAAACAAAGA GTATCACTAG AACTGAGTAA CAATTTTGAA GCATCACTTA GTGGTTGGTC 5940 

TGCAGATTAC CCTGATCCTA TGGCTTATTT AGAAACAATG ACCACAGGTA GCGCACAAAA 6000 

TAATACAGAC TGGGGTAATA AAGAATATGA TCAATTACTT AAAGTAGCAA GAACCAAATT 6060 

GGCACTTCAA CCGAACGAAC 6ATATGAAAA CTTGAAAAAA GCAGAAGAAA TGTTCCTAGG 6120 

AGATGCACCG GTAGCACCAA TTTATCAAAA AGGTGTtGCA CATTTaACAA aTCCTCAAGT 6180

AAAAGGATTA ATTtACCATA AATTTGGTCC AAATAACTCA CTTAAACATG TATATATTGA 6240 

TAAATCGATA GATAAAGAAA CAGGTAAGAA GAAAAAATAA TATGCTTTGT AAATTAGGCT 6300 

GGAGACATAT CTCCAGTCTT TTTGTGTTGG ATAAAAaCTT TGGGAATAAA AATTTAAAAT 6360 

AAGTCGTTTT TTAAATTACT GAAATTGATT AAATGCATAA ATAACTGAAT ATTCTAAAAA 6420 

TAAACTTGTA ATAATTTTTT CTATGAGTAA ACTAAAAAGA AAAAATTAGA TTGAAAGTAG 6480 

GAGGCATATG TATGGGGAAG CTAATTAAAT ATATTTCAAT ACTTCTTATT GTCGTTTTAG 6540 

TGTTGAGTGC TTGCGGAAAA AGCAGTAATA AAGATGAAGG AGTAAAAGAT GCTACTAAAA 6600 

CGGAAACCTC AAAACATAAA GGTGGTACCT TAAATGTAGC ATTAACAGCA CCGCCAA6TG 6660

GTGTTTATTC TTCGTTATTA AATAGTACAC ATGCAGATTC TGTAGTTGAG GGATATTTTA 6720 

ACGAAAGCTT 6730 
(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6482 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

AATTTTTGTC ATTATTAAAA ACCTCGCTTT TAAAAGATTG AAAAGTAAAT GAGTGAAATT 60

AAAGATTATG CACATTAAAA TCACGCCACA ATTTAATTGT GAAAAATATC ACAAATATAT 120 

TATAACACTA AATTTCCCAA AATTCAAAAG TGTGTTTTAT TGCAGAAAAC TTATAACAyG 180 

TGCACAAGTT ATAGTGAATT GCAAACGGAT TACTTTAGTC TTTTTAAAAC ATGAAGTATA 240

ATTTGTATAG CAATAAATAT AAAAATGGGA GGCTATGTTC AATGAGCAAT ATGAATCAAA 300 

CAATTATGGA TGCATTTCAT TTCAGACATG CGACTAAGCA ATTCGATCCA CAAAAGAAAG 360 

TTTCGAAAGA AGATTTTGAA ACAATATTAG AGTCAGGTAG ATTGTCTCCA AGTTCTCTTG 420 

GGTTAGAACC TTGGAAGTTT GTCGTGATTC AAGATCAAGC GTTACGTGAT GAATTAAAAG 480 

CGCACAGTTG GGGCGCAGCA AAACAATTAG ATACAGCGAG CCATTTTGTG CTAATTTTTG 540 

CGCGTAAAAA TGTAACGTCA AGATCACCGT ATGTACAACA TATGTTAAGA GATATTAAAA 600 

AATATGAGGC ACAAACGATT CCAGCTGTTG AACAAAAATT CGATGCATTC CAAGCAGATT 660 

TCCATATTTC TGATAATGAT CAAGCCTTGT ATGACTGGTC AAGTAAACAA ACGTATATTG 720 

CATTAGGCAA TATGATGACG ACAGCCGCAT TGTTAGGTAT TGATTCATGT CCGATGGAAG 780 

GTTTTAGTCT GGATACAGTG ACAGACATTT TAGCAAATAA AGGGATCTTA GATACTGAGC 840 

AATTTGGTTT ATCAGTGATG GTCGCATTTG GCTACAGACA ACAAGAGCCA CCGAAAAATA 900

AAACACGCCA AGCTTATGAA GATGTTATTG AATGGGTTGG ACCAAAAGAA TAAATAGAAT 960 

ACCGTATG7C TAAATATATA AAATTAAAAA GTTAGCAATA AAAAAGCCTG CGATTACATA 1020 

AATGAATCGC AGGcTTTTGC GTGAAAAAAT TGTATTAATA AAGTATGGAT GATTATTTTT 1080 

CTGGfiACAAG GTCAGTATTT GAATGAACTG TGATGTCAAA CCCTTCTGGT GCCGTAAATG 1140 

TATGTGTTGA GGCGTCGGGT TGATAAATAT CAACATGTGT TAATCCATAA CTTTGTGAAT 1200 

TGTTTTGTCT TGCTTGATTG GATTGCCAAG TATTAGCAGC AATATGATGG TGATAATGAT 1260 

TCGTTGACAT AAATAGCGCA OGTGGAAAAT CAGACACATG TTGGAATCCT AATTGTTCAA 1320 

TGTAACATTG ATATGCTGCG TCTAAATCAT GTGTTTTTAA ATGTAAGTGT CCAATCATGC 1380 

CTTTTGCTGG CATTCCTTGC CAACCTTCAT CAGTACGATG TGTTAATAAG GTTTGGCTAT 1440 

CAACTTCTAA AGTATCCATT TTAACTTTGC CATTTTGCCA TTCCCATGAA GATGAAGGTC 1500 

TATCGCGATA GACTTCAATA CCATTACCTT CGGGGTCGTT GAAATATAAA GCTTCACTTA 1560

CTAAATGATC ACCAGCGCCG ATGCCCATAT TTTTTTGTGC CACGAAATAT AAGAAGTTAG 1620 

aAGTCTGACG GcCGTCTTCT AATAAATGTA ACGTTAGAGT ATGGcCACCA GTCCCAACAG 1740 

ATAATACGGT TGTATTATCG TCAGAACTTT TAACGGATAG TCCTAAAATG TTTTTGTAAA 1800 

ATGTTGTCAT TAAGTCTAAG TCTCTTACGT TCAGTACAAT GTTTGTCACT TGTGTTGCTG 1860 

TTTTATCGTG AAATGCCATT ATGCATCGCC TCTTTTTCTA TTTTTCTATA AGTTAGTATA 1920 

AAAAGTATAC CAGAAAAGAA AATGAATTGA TAGCATAAAG TTTGAAATGC AAAATAACTA 1980 

GTCGTTTTGC AATTTTAtAT TGATGCGAAC AAAAAAGCGA TGGTACAGTT GCACCATCGC 2040 

AAAATTTATT TAACCAAGAT ATACATCTTG ATATGAATCT TCTTTTTCTA ACATATGTTT 2100

GGCAAATGAA CATGAGGCAA TAATTTTCAA ATTATTTTCT OGAGCGTGTT CAACAACTGc 2160

TTTAAGTAGT TTTTTGCCAA CACCTTGACC ACCAAGTTCA TCAGATACGC CTGTATGATC 2220 

AATGTTAATT TCATTATTAT CCACAAAACG GTATGTGATT TCAGCTAAAG CATTATTTTC 2280 

ATCATCACCA ATATAGAATT TGTTCTCGCC TTGTTTGATT TCAAGGTTAC TCATACATAT 2340 

CAACTCCTAT CATGATTGAT TATAGTATTT CCCTATTCTA TTTTAACTTA AACGAAGTCA 2400 

AAGGTGCATG ACAGTCATGT GACGACATTG CCACATCTAT GTAGTCGTTT TTATTAAGCA 2460 

CAGTTTGAAA TGAAGATGAA AACACGTATC TTGACATTAA ATCTATTCAG CTATATAATT 2520 

TATCTCGAAA TCGAAATAAA ATAAAAAAGT TGGTGATCAT ATGGATCGAA CGAAACAATC 2580 

TCTCAATGTT TTTGTCGGAA TGAATAGGGC GTTAGACACA TTAGAGCAAA TTACAAAAGA 2640 

AGACGTAAAG CGATATGGCT TAAATATTAC TGAATTTGCA GTGCTCGAGT TGCTTTATAA 2700 

TAAAGGTCCG CAACCAATTC AACGTATTAG AGACCGC6TA TTAATTGCAA GTAGCAGCAT 2760 

TTCATATGTT GTAAGTCAAT TAGAGGACAA AGGTTGGATT ACACGTGAAA AGGATAAAGA 2820

TGATAAACGT GTATATATGG CTTGTTTAAC TGAAAAAGGT CAAAGTCAAA TGGCAGATAT 2880 

TTTOCCTAAG CATGCTGAGA CATTAACAAA AGCGTTTGAT GTGTTAACAA AGGATGAATT 2940 

AACAATCTTA CAACAAGCGT TTAAGAAACT AAGTGCACAA TCTACAGAAG TGTAAGGCGT 3000 

GCACTAAAAA TTTACATTAA AGTATCTCGA TTTCGAGATA AATGCACTAA AAATATAAAG 3060 

AGGGTATATA AAATGATAAA TAATCATGAA TTACTAGGTA TTCACCATGT TACTGCAATG 3120 

ACAGATGATG CAGAACGTAA TTATAAATTT TTTACAGAAG TACTAGGCAT GCGTTTAGTT 3180 

AAAAAGACAG TCAATCAAGA TGATATTTAT ACGTATCATA CTTTTTTTGC AGATGATGTA 3240 

GGTTCGGCAG GTACAGACAT GACGTTCTTT GATTTTCCAA ATATTACAAA AGGGCAGGCA 3300 

GGAACAAATT CCATTACAAG ACCGTCTTTT AGAGTGCCTA ACGATGACGC ATTAACATAT 3360 

TATGAACAGC GCTTTGATGA GTTTGGTGTT AAACACGAAG GTATTCAAGA ATTATTTGGT 3420 
TTAAATGAAG GGGTAGCACC TGGTGTACCT TGGAAGAATG GACCGGTTCC AGTAGATAAA 3540

GCGATTTATG GATTAGGCCC CATTGAAATT AAAGTAAGTT ATTTTGACGA CTTTAAAAAT 3600

ATTTTAGAGA CTGTTTACGG TATGACAACT ATTGCGCATG AAGATAATGT CGCATTACTT 3660

GAAGTTGGCG AAGGAGGCAA TGGTGGCCAG GTAATCTTAA TAAAAGATGA TAAAGGGCCa 3720

GCaGCACGTC AAGGTTATGG tGAGGTACAT CATGTGTCAT TTCGTGTGAA AGATCATGAT 3780

6CAATAGAAG CGTGGGCAAC GAAATATAAA GAGGTAGGTA TTAATAACTC AGGCATCGTT 3840

AATCGTTTCT ATTTTGAAGC ATTATATGCA CGTGTGGGGC ATATTTTAAT AGAAATTTCA 3900

ACAGATGGAC CAGGATTTAT GGAAGAT6AA CCTTATGAAA CATTAGGCGA AGGGTTATCC 3960

TTACCACCAT TTTTAGAAAA TAAAAGAGAA TATATTGAAT CGGAAGTTAG ACCTTTTAAT 4020

ACGAAGCX3TC AACATGGTTA ATTGGAATGA GGAGGATTTG TGATGGAACA TATTTTTAGA 4080

GAAGGACAAA ATGGT6CGCC AACACTAATA TTATTGCATG GTACAGGTGG TGATGAGTTC 4140

GATTTATTAC CGTTAGGCGA AgcATTGAAT GAAAATTATC ACTTGTTAAG TATTAGAGGA 4200

CAAGTTTCAG AAAATGGGAT GAACCGTTAT TTCAAACGTC TTGGTGAAGG TGTTTATGAT 4260

GAAGAAGATT TGGCATTTCG TGGACAAGAA TTGTTGACGT TCATTAAAGA AGCTGCTGaA 4320

CGTTATGATT TTGaTATTGA AAAAGCAGTA CTTGTTGGAT TTTCAAATGG ATCAAATATA 4380

GCGATTAACT TAATGTTGCG TTCAGAAGCA CCATTTAAAA AAGCATTGTT ATATGCACCG 4440

TTATACCCAG TTGAAGTAAC GTCAACAAAG GATTTATCAG ATGTCAGTGT GTTGCTTTCT 4500

ATGGGGAAAC ATGATCCAAT TGTGCCATTA GCTGCAAGTG AACAAGTCAT TAACTTGTTr 4560

AATACACGTG GGGCACAAGT CGAA6AAGTT TGGGTGAAGG GCCATGAAAT TACAGAAACT 4620

GGATTAACGG CTGGTCAACA AATACTTGGG AAATAACAGT TCTATTAAGA AGCGGACAGA 4680

TGGAAAAGAT TTTTACTTTT CATCTGCCCG CTTTTTTGAT TTTGAAGTGC TGTACTAAAT 4740

TTTACAATAG TATAGATATT TTAATCGATA TGAGATTTGC CGGTAATACG CTTAATTATUV 4800

CCTTTATAGA GTACAGGTAT GAGTAAGATG AAACCGAACA ATCCCATAAT AGGGAATACT 4860

TTTCCAATTA ATGAAATGAa ACCGATAAAT GTACTAATAT AAGTGATGAC AGCCATTGTA 4920

ATAATAATGA TGAAGTAACG TCTGCTGAAT GGAACGCTGA AACGTGACGC AAATGCATAC 4980

ATTAATCCAA CAACAGTATT GTAGATGACA AGTATCATAA TGACAGACAT AATAATACCA 5040

ATTGACGGAG ACATTTGTGT CGCTAATTTT AATGTAGGTA GATCTACGTG TTTAATTTTA 5100

TCGAATTGAG AAATTAAACC TAGATTAATC ATCATGAGTA AAAATGTAAT GATTAAACCG 5160

CCAATCAAGC CCCCGTATAA CXSTTGAGTCA CGATATTTAA CTTTACTACC CATCACTGAT 5220
CCAGGTGATA ATGATTTCTG CTTATGAATC TGAGCATCAT TATTAGCGGC AGTAAAATCA 5340

AGATGACTTG TTGTGAAATA GTAGACCGCA ATCATAATGA CAATCGCAAT TAAAAATGGG 5400 

GTAACACCGC CAAGCACAGC AATTAAACGA TCGAATTTTA GAAACAGTGT TGCTAAAATA 5460 

AAGGCGACTA ATATGAGTGC GCTCAGCCAA TACGGTAAGT TGAAACTTTG ATGAATGGTT 5520 

GACGCACCAC CTGCAGTCAT AATAATAGCT AAAGACAACA TAAACATTGT TAAAATAATA 5580 

TCAAAACCTC TTGCAATAGA GGGGTATAAG AAATAGTTAA TTGAATCAGA ATGATTTCTG 5640 

GACTTTAGAT GATGACCTGT ATGCATGACA ACCATTCCAC CTAAAGTAAT CAATAGTCCT 5700 

GTTACAATAA TGCCTGAAAT GCTATATGCG CCATGACTTG TGAAAAACTG GAAAATTTCT 5760

TGACXAGTAG CAAAGCCGGC ACCAACX3ACA ACACCAACAA AGGCAAATGC CACAATAATG 5820 

GACTCTTTTA AGATACX5CAT GATTTAAAAA TGTCCCTTCG TAATTTTAAG TAATATAGAA 5880 

AATGTAACAT ACATGTTAAT GAAAAATATA GTACTAATAT AGTATTTTGT TAAATTGGAG 5940 

TAGAAGCGAG GGTGTCGGTC ATTTCATTAA TTTATTAGTT GATTTTGCAT TTTTTTGCTG 6000 

TAAAGTTGTT ATAATACAGT TAACAGGAAT TAGCATAGAT ACACCAATCC CCTCACTACT 6060 

CGCAATAGTG AGGGGATTTT TTTCGGTGTA GCTAGGTCGC CTATTTATCA TCGTGTTTGC 6120 

GTAgCaATGC GTAAACACAG TACCACTAAA TAAGTGCACG ATACATGCAT CAAATGTCGT 6180 

CTTTAGTcTA AGTAACGATC ATGCATTAAC ATTTTCAAAA TATCTATTTG AGCTTGAAGA 6240 

TCTTTACCAA TATTGGTATC ACGAATCTTC TTACGTTGTA ATTCTTTATC TACGACGCGC 6300 

TTTATAGAAA GTTCATCGAT ACCTTCGGAA AGTATTTTTn CTTTAGCGTT AAATTGTTGG 6360 

TGTGCAACGA GTTGCATACC GAATGAATTA TACAATAGTG TATAGCCTGC AATGCCAGTn 6420

GTTGACTGAT AAGCTTTTGA AAAGCCACCA TCAATGACAA GCATCTTTCC ATCAGCCTTG 6480 

AT 6482
(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16592 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 
ATTTAAGGCG ATTGCTTGTG TATTTCTCTC TTTTGTAGGC AAACCTGCAC TCGTTCCAAA 60 
AAATGTAACT TCCATATATG CCCCTCCTTT TCTTCAATTC ATTTTATCAT AAAATTTGTA 120 
AATTTTTCTA ACTTTAACGT AGACATAACT ATATAAATTT TGATAATTAC GTTATACTTA 240

TCATTAATAA GTATCACATT AAACATGATA CATGAATCX3A TATTTCATTT AAGACACTGC 300 

ATACAGTCGA GCATATTGTA TGACCTACTG AATGGATTAT CTTATAATAA TAAATCATAT 360 

ATCTAATTAA GAATTGAGGT TTTAATCTTG AGTACTAAAA ACAAACACAT CCCATGTTTA 420 

ATCACAATCT TTGGTGCACT GCGTGACTTA AGCCATCGTA AGTnGTTTCC ATCAATATTC 480

CATCTCTACC AACAAGACAA TTTAGATGAA CATATTGCCA TcATCgGTAT TGGACGTCX3T 540 

GACATkwnTA ATGATGATTT CCGTAATCAA GTAAAATCAT CAATTCAAAA GCACGTAAAA 600 

GATACAAACA AAATTGACGC GTTTATGGAA CATGTCTTCT ATCATAGACA TGATGTTAGT 660

AATGAAGAAA GCTATCAAGA ATTACTAGAT TTTAGTAATG AATTAGATAG CCAATTTGAA 720 

TTAAAAGGTA ATCXSACTATT CTATTTAGCA ATGGCACCAC AATTCTTTGG CGTTATTTCT 780 

GATTATCTAA AATCTTCTGG TCTTACTGAT ACAAAAGGAT TTAAACGCCT TGTTATCGAA 840

AAACCATTCG GTAGTGATTT AAAATCAGCC GAAGCATTAA ACAATCAAAT TCGTAAATCA 900 

TTTAAAGAAG AAGAAATTTA TCGTATTGAC CACTATTTAG GAAAAGACAT GGTTCAAAAT 960 

ATCGAGGTAT TACGTTTTGC GAATGCGATG TTTGAACCAT TATGGAATAA CAAATATATT 1020 

TCAAACATCC AAGTTACATC TTCTGAAATA CTAGGTGTTG AAGATCGTGG TGGTTATTAT 1080 

GAATCAAGTG GCGCGCTAAA AGATATOGTG CAAAACCACA TGTTACAAAT GGTTGcATTA 1140 

TTAGCTATGG AAGCACCTAT TAGTTTAAAT AGTGAAGATA TCCGTGCTGA GAAAGTAAAA 1200 

GTACTTAAAT CACTGCGTCA TTTCCAATCT GAAGATGTTA AAAAGAACTT TGTTCGTGGT 1260 

CAATATGGCG AAGGCTATAT CGATGGTAAA CAAGTTAAAG CATACCGTGA TGAAGATCGC 1320

GTTGCAGATG ACTCTAACAC ACCTACCTTT GTTTCAGGTA AATTAACAAT TGATAACTTT 1380 

AGATGGGCTG GTGTACCATT CTATATTCGT ACTGGTAAAC GTATGAAATC TAAAACAATT 1440 

CAAGTTGTCG TTGAATTTAA AGAAGTACCA ATGAACTTAT ACTATGgAAA CTGaTAAACT 1500

GTTAGATTCA AACCTATTAG TAATCAATAT CCAACCTAAT GAAGGTGgTA TCTTTtACAT 1560 

CtAAATGcTA AGaAAAATAC ACAAGGTATC gAAACAGrAC CTGtCCmATT GtCTTACTCm 1620 

ATGaGCGcTC aAGaXAAAAT GaATACTGTA GATGCATATG AAAATCTATT ATTTGATTGT 1680 

CTTAAAGGTG ATGCCACTAA CTTCACGCAC TGGGAAGAAT TAAaATCAAC ATGGAAATTT 1740 

GTTGATGCAA TTCAAGATGA ATGGAATATG GTTGaTCCAG AATTCCCTAA CTATGAATCA 1800 

GGTACTAATG GTCCATTAGA AAGTGATTTA CTACTTGCTC GTGATGGTAA CCATTGGTGG 1860 

GGAGGATATT CAATAA7TGA ATTAAAACGC ACATGTTAAA CAAAAATAAA TGAGCGAATG 1920 
TATATTATGA AATTATATTT TACAATGCCC AAAACTATTT TAATAATCAT TGAACAAATG 2040

GGTGTATAAT TTATAGAAAT AATGTAGAAT AAAAATAAAT GATTGAATTA ATTGGAGTGA 2100 

AAGTTTTGGA CGTTATCAAG CAAATACAAC AGGCAATTGT TTATATTGAA GATCGTTTAT 2160 

TAGAGCCTTT CAATTTGCAA GAATTAAGTG ATTACGTTGG TCTTTCGCCA TACCATCTTG 2220 

ATCAATCATT TAAAATGATT GTCGGCTTAT CTCCAGAAGC TTATGCACGC GCGCGTAAAA 2280 

TGACACTCGC TGCAAATGAT GTGATTAATG GTGCTACACG ACTTGTAGAT ATCGCTAAAA 2340 

AATATCACTA TGCAAATTCA AAT6ATTTTG CAAATGATTT TAGTGATTTT CACGGCGTAT 2400 

CACCTATTCA AGCCTCTACT AAAAAAGATG AATTACAAAT TCAAGAGCGA TTATATATCA 2460 

AATTATCAAC TACTGAGAGA GCACCTTATC CATACAGATT AGAAGAGACA GATGATATTT 2520 

CATTOOTTGG ATATGCAOGA TTTATAGACA CTAAGTATTT GTCACATCCT TTTAATGTTC 2580 

CGGATTTTTT AGAAGACTTG CTCATTGATG GTAAAATTAA AGAGTTACGA CGATATAATG 2640

ACGTTAGTCC ATTTGAACTA TTTGTTATTA GTTGTCCTCT TGAAAATGGT TTAGAAATAT 2700 

TTGTAGGTGT ACCAAGTGAA CGTTATCCTG CACACTTAGA AAGTCGATTT TTACCTGGCA 2760 

AACATTGTGC GAAATTCAAT TTACAAGGTG AAATTGATTA TGCAACTAAT GAAGCTTGGT 2820 

ACTATATTGA ATCAAGTTTG CAGTTAACAT TGCCATATGA ACGAAATGAT TTATATGTTG 2880 

AAGTGTACCC TCTCGATATT TCATTTAATG ACCCATTCAC TAAAATTCAG CTTTGGATTC 2940 

CTGTTAAACA GA6TCCTTAT GACGAAGATT AAATAATAAA AAACAAAGAA GCCCCCTAAT 3000 

ATATCTATAG GTCTACAAAT GGCCTTAGAT TCTATTAGGG GGCATATTAA TATGTTAATT 3060 

TAGTTCGATA ACACATGCTT CATATGGACG TAACTGITTT AAATTAACTT TGGCATCATA 3120 

ATTAAATAGC TTTACTTCTC CATGGCTTAA ATCAAATGGT ACAGTTAATT CTGCTTCGTG 3180 

GTTAiSTAAGA TTACCTACAA TAAGAACTTG CTTTTCATTT AATGTTCTCG TGTACGCAAA 3240 

AACTTGTGAA TTTTCAGCAT CTACTAAATC AAATTGACCA TATACGTATA CATCATTAGA 3300

CTTTCTTAAT TGAATTAAAT CTTTATAAAA TTGTAATACT GAATGCTCAT CTTCTAATTG 3360 

TTGTGCAACA TTGATAGTTT TATAATTCGG ATTCACTGGG AACCACGGTT CACCATTTGT 3420 

AAATCCTCCA TTTAACGTAT CATCCCATTG CATTGGTGTG CGAGAATTAT CTCGGTTCTC 3480 

ATCTTTATAT TTCGCAAGTA AAGCGTCTAC ATCTCCACCT TGAGCTTTCA CTATTTGATA 3540 

GTCATTTTTA ACAGCAACAT CGTTAAACGT TTCAATACTT TCAAATGGAT AATTCGTCAT 3600 

ACCAATTTCT TGACCTTGAT AAATGAATGG CGTACCTTGT TGCAAGAAAT AAACAGCTGC 3660 

ATGACTTGTT GCTGATTCAT ACCAATACTT GTCATCGTCA CCCCACGTCG ATACACGTCG 3720 
CCATCTATTT AATACAGATT TATACGAATT TACATCAAAG TGAGAATCAC CACTATTCCA 3840

CAGTCCCAAA TGTTCAAATT GGAATATCAT ATTAAATTTA CCATTTTCTT CCCCGACCCA 3900 

GTCATCAGCA TCATCAGGGC TTACACCATT CGCTTCACCA ACAGTCATAA TGTCATACTT 3960 

ACTTAATGAG CGATCTTTCA TCTCTTGTAA CCAAGTTTGT ATACCTGGCT GATTCATATC 4020 

TACATCAAAT GCTGGGGCAT ATGTTTTACC CTCAGGTACA GGTAAGTCAC CCGCTTCAAA 4080 

CGTCTTCTTA ATATGCGTAA TTGCATCTAC TCTAAATCCA TCAATGCCTT TATCAAACCA 4140 

CCAGTTCATC ATTTCAAATA CAGCATCTCT AACTTCCGGA TTACCCCAAT TCAAATCAGG 4200 

TTGTTTTTTA CTGAATAAAT GGAAATAATA TTGCTCAGTA TTAGCATCAT ATTCCCATGT 4260 

AGATCCATTA AATATACTTT CCCAGTTGTT AGGTTCAGAG CCATCTGGCT TTGGATCTTG 4320 

CCAAATGTAC CAATCACGTT TGGGATTGTC TTTACTAGAT TTGGATTCTA TAAACCAAGG 4380 

ATGTTCATCA GATGTATGAT TTACAACTAA ATCTAAAATA AGCTTCATGC CTCTATCATG 4440

AACACCTTTT AATAAACGAT CAAAGTCTTC CATCGTTCCA AATTCATCCA TAATCTCTTG 4500

GTAGTCACTA ATATCATAAC CATTGTCATC ATTAGGTGAT TTAAACATTG GACTGAGCCA 4560 

AATGACATCG ATACCGAAAT CTTTTAAGTA GTCCAATTTA TCAATCATTC CAGGTAAATC 4620

CCCAATACCA TCGTGATTAC TATCATTAAA ACTTCTTGGA TATACTTGAT ATGCTACTGC 4680 

TTCTTTCCAC CATTGCTTAT TCATTTTAAA ACTCCTTTGC TATCGCTGTG TTGATTTTCT 4740 

TATTTTTAAT TCTGTATCTA TAATGACGAG TTCAATAACA TCCTGTGCTT TGTTTTTCAA 4800 

TATATTTAAA ATTGCTGCAC CAGCCTGTTG ACCTAACATT CGAGGCTTGA TGTCAATACA 4860 

GGTTTGTGGT GGTGACGCAA TTTCGGTTAA ATAAGAATCA TTGAACGTTG CTGTCATTAC 4920 

ATCTTTCGGA ATTTCAATAT TAAGTTCATA TAGGACACTT AAAATCGCTA AATGTAACAT 4980 

AGCATCTAAC GAAATGATTG CCTGTTTAAT ATTTGGGTCC TTCAAACGCG TATGTAGATT 5040 

TTGCATGTAA TTAAAAATAA CTTCTCTTTC ATTACTAGTC TCAATAATTT GATAATTAAT 5100 

TTTATTTTGA GAAGCTATCG TTTCAAATCC TTGAATTCTA TCTTTTGAAA CTTCAAAATT 5160 

TCCTTTTTCT GTAATAAATA TTAATTCATC TACACCTTGT TCAATAACAT GTCGTGTCAA 5220 

ATTTTCAGAA GCTAATATAT TATCATTATC TATATGTGTA AATTGATGAT CTATATCCGA 5280

TGTAGGCTTA CCAATCACAA TAAATGGCAT GCTTTCATCA ATTAACATTT GTTTAATCGG 5340 

ATCATTTTCT TTTGAATAGA GCAGTATAAA CGCATCAACC ATTCGTTGTT TAATCATTTT 5400 

ATAAACTTCA TCCATTAAAT CATTCATATT ATTTGAGACT GTCGTTTGTG TACCATAGCC 5460 

ATGCTGGTTA CACGTTTCAG AAATTCCTAG CAATACATTG ATGTAGAATG GATTCAGTCG 5520 
AGTTCTAGCA GCGGTATTAG GAAAATAATT CAATTCTTCC ATAACTTTCT TCACTTTTGA 5640

AATTGTCGCr TCGCTAATAC GTTGATTTCC TTTTATAACT CTTGAAACTG TCGAAGGAGA 5700 

AACACCGGCT TTTAGTGCAA CATCTTTAAT CGTAACCATT TAATCACCTC CTGTTAATTT 5760 

CTGCATCGGA AAACGCTTCC AACCACTGTA TAATACCAGT TTAGTCACAC TTTCTAAAAA 5820 

AGTCAAAAGA TTTGTGCAAA CGATTGCATA AAACGATAAA AATAAAACCT TCATACTGAA 5880 

ATTCAATCCG AAAATCAATA TAAAGGTTTG TATAAATATT AAAATCGATT GTTTAGTCAC 5940 

TAACTGCAAA ATAGTTACCT TGGCCATCTT GAAAATTAAA TACACGTTGA CCATTCATTT 6000 

CTACTATATC ATGCCCAGTT AAACCTAAAT CATTTAATTT TGAGTATAAT GCATCAAAGT 6060

TTTTCTCTTT AAACATTAAA GATGGTGTTC CTAGGTTCAC TTCCX3GGCTA TGCTTTTCAA 6120 

TAAATTCTTT TGCCATAATC GTCAATGACG TTTCAGCATC TTTGGTAGGT GATACTTCAA 6180 

CTGCAACATA GTCCTCAGCT AACGGTGTTT CACTTACAAC AACAAATTCT AAAGTTTCTG 6240

TCCAAAATGC TTTCGCTTTT TCGACATCAT CAACATATAA CATAACTTGA TTTAACTTTT 6300

CCATAAAATA GTACCTCTAT TTCTCTATAG TACATGCTAT CATAACACAG TAAATATTTT 6360 

ATTACTTCAC AAAATGCTTA AAAATATGGC GGGATGCTTT TAAGGTCAAG GATAATACTT 6420 

GTGTAATTTT TTATAGGTTG TAGCTACTCT ATCACACTCT CTTTTATATT TATCAAAAGA 6480 

TATAAAAAAG GATAGTATCT TTCAACTATC CTTTAATCAA TATTATTCTT CAATCCATTG 6540 

TGTATGGAAT ACGCCtTCTT TATCTTTTCT TTCGTACGTA TGAGCACCGA AGTAGTCACG 6600 

TTGTGCTTGA ATTAAGTTTG CAGGTAAATC AGCAGCACX3K3 TAACTATCAT AGTAATTAAT 6660 

ACTTGATGAG AAACCAGGTG TTGGTACACC ATTTTGAACA CCAGTTGCGA CT^ACATCACXS 6720 

TAACGCATCT TGATATTCAG TAACGATGTT TTTAAAGTAA GGATCTAGCA ATAAGTTTTG 6780 

TAAICCTGGA TTATTATCGT AAGCATCTTT GATCTTTTGT AAGAATTGTG CACGGATAAT 6840 

GCAACCTTCT CTCCAAATCA TAGCTAAATC ACCAAGTTTT AAATTCCATT CATTATCTTC 6900

ACTTGCTTTA CGCATTTGcG CGAAACCTTG TGCATAAGAA CAAATTTTAC TCATATATAA 6960 

TGCTTTACGA ATTTTTTCTA AAAAGTCTTT CTTGTCACCA TCAAATGATG CTTTTGGACC 7020 

ATTTAATTCT TTAGAAGCAT TTACGCGCTC TTCTTTGaTT GAAGAGATAA AACGTGCAAA 7080

TACAGATTCA GTAATGATTG TTAATGGAAT ACCTAATTCT AATGCGTTAA TTGAAGTCCA 7140 

TTTTCCTGTA CCTTTTTGaC CTGCAGTATC AAGAATTTTT TCAACTAATG CTTCTTTATT 7200 

TTCATCTAAT TTCATGAAAA TATCACCAGT GATTTCAATT AAATAACTTT CTAATTCACC 7260 

AGCATTCCAG TCTTTGAACG TTTGAGCAAT GTCTTCATGA GACATGCCTA ATAATTCTTT 7320 
CATTTTCACA TAGTGTCCAG CACCATTAGG TCCAATATAA GTAACACATG AAGCACCGTC 7440

TTTTGCCTTT GCAGCAATTG CATCAAGAAT ATCTGCAACT TTGTTATAAG CTTCTTCTTG 7500

TCCACCCGGC ATTAATGACG GACCAGTTAA CGCTCCAATT TCACCACCAG AAACGCCCAT 7560 

ACCAATAAAG TTGATTGCAC TTTGTGywAA TGCTTTATTA CGTCTGATAG TATCTTGATA 7620 

GTTTGTATTA CCACCATCAA TTAAAATATC TCCATCATCT AATAAAGGTA ACAAACTATC 7680 

AATCGTTGCG TCCGTAGCTT TACCTGCTTG AACCATTAAT AAAATTTTAC GTGGTTTTTC 7740 

TAAAGAATTA ACAAATTCTT CCAATGAATA CGTTGGATGA ATATTTTTCC CTTTTGATTC 7800 

TTCAACCATT AAATCAGTTT TTTCACTTGA GCGGTTAAAT ACAGATACAC TATATCCGCG 7860 

TGATTCAATA TTCCAAGCTA GGTTTTTACC CATAACGGCT AAACCAATAA CTCCAATTTG 7920 

TTGTGTCATA TTACTTACCT CACTTGTrGA TTTTTCATTA GTATTGTATC ACAAAATAGA 7980 

CATACACTAC ACTAAATCAT TTOGAATGTC GCGCAACTAT TTTGATTATT TCTAACACTT 8040

GACTTOCAAG CAAGTTCAAT GATTTAATCG GCATTCTCTC ATTTGTTGTA TGGATTTTTT 8100 

CATAACCCAC TCCTAAAATG ACTGAAGGAA TACCAAATGT ATTAATAATA CTGCCGTCTG 8160 

AACCGCCACC AGAAATAATT GTATTTGCAG ATAATCCTAA ATTACGAGCA CTTTCTTGTG 8220

CAATTTTAAC AACCGCTTCA TTATCATTAA TTTTAAATCC TGGATAACTT TGCTCCACTG 8280 

TAACTACTGC TTTCCCACCT AATTCTGATG CAGTAGTTTC AAACACATCA GTCATATGTT 8340

TGACTTGTGT TTTTATTCTT TCTGGATCGT GAGAACGTGC CTCTGCTTCT AAAATGACTT 8400 

CATCTGCAAC AATATTCGTA GCTGAACCGC CATGAAACTT ACCAATATTG GCAGTAGTTA 8460 

TTTCATCAAC TTGTCCTAAT TTCATTCGAC TAATTGcTTT CGCCX5CAATA TTAATAGCAC 8520 

TAACACCCTC TTTTGGCGTA CTTGCATGAG CCGTTTTGCC AAAAATTTTA GCTGAAATTA 8580 

ACATnTGCGT OGGTGCACCT ACAACCGTAG TACCGACATC AGCACTTGCA TCAATAGCAT 8640 

AACCAAAGTC CGCGTCCAAC AACTCTGAAT TTAATTCTTT AGCACCAATT AAACCTGATT 8700 

CTTCTCCAAC AGTAATCACA AATTGAATTT GTCCATGTGG GATTTGTTGT TCCTTTATCA 8760 

CTTGCAAAAC TTCAAGCATC GCTGATAATC CTGCTTTATC ATCTGCACCT AGAATAGTCG 8820 

TACCATCAGA GTATATGTAG CCGTCATCTT TTACAATTGG CTTTACATTA ATTGCGGGTA 8880

CAACAGTATC CATATGGCTC GTCAAATATA ATTTAGGTAC TTCGCCTTCT TCGATAGTAC 8940 

TATTCATTGT ACACACTAGA TTATTGGCAC CTAATTTAGG ATGTTTAGCC GCTTCATCTT 9000 

CTTTAACATC TAACCCTAAT GCTATGAATT TTTCTTTTAA AATAGGTTGG ATTGTTGATT 9060

CATTCCCTGT CTCAGAATCG ATTTGTACAA GTTCAAAAAA CGTATTAAGT AATCTTTGCT 9120 
GATGAAATAA AATGTTACAG TAATTGACGT TACACAGATT TATCAGGTTT GTAAATTGTG 9240

TCATATTATT TTCAATTTAT TATATATAAT TATTGTAACT CAAACTAAGC TTTGTCAAAA 9300 

ATATATTGAT TGATTTTTCA AAGATATCGT ATAATGAGGA AAATGACATA AGCAAACTTA 9360 

CTCATGTTTT TTATTATATT CCTTTATGAT GATTGCTAGT TATATCGTCT CAAGTTAAAA 9420 

GTTTTATATC TTATGTCGTA ATTATTAATA CAAAGGTTAT TCATTTGGAG GCACACAAAA 9480 

TGCAAAATAA AGTTTTAAGA ATTATCATTA TCGTTATGCT TGTATCAGTT GTATTAGCAT 9540 

TGTTATTAAC GAGTATCATT CCAATTTTAT AAACTATATC TCAACTACCT ATACAAAATC 9600 

ATACAATTAA AAATCCATCC ATTATAAACG CATGTATTAA TAAGTTATCG TATTGCAACG 9660 

ATTACTTTCA AACATGGGTC ATACX3GATGG ATTATTTTTT AAGCTACTTC ACTATGCATT 9720 

TTCAATGAAC CAAATTGCGA TTTGATTTGT AAATATTCTT CTAATTCATT TAATATTTGA 9780 

ATAATACTTG CTCTCGAGTT AAGCGCTTTG TGTGTTGTTG GCAATGGCAG TTCATCCAAT 9840

TTCAAACGCG TCTCATACAA ATTGTGTAAA CGCATTGCTG TATAGTCATT ACTATTCACA 9900 

TTTAGACCAA TTTCTTTCAG CAGTGACGCA ACATCATTTA AAAGCGGATC TTTATGACAG 9960 

ATACTTTCGA TGAGCGGTTT CATTCTCATT AACAATTCCA CTTGCTCTTC TCGCATATCA 10020 

AAATAATGAT AGTATGAATT TTCGTTTCTA ACAAAATGAT TTTTAACATC TCX3GAACGCG 10080 

ATAGACTtCG CCTTTTTAAT ATTTAAAAGT AACACTTCAA ATTCAATCGC AATGGTATCT 10140 

TCATATTTTT CACAAATATA ACTATATTTA CTAAAAATAT CAGCAATTTG TTGCTCAATT 10200 

TTACATTTGT ATTCGTCtAG TTGTTTGTCT AAACTTGGCA TCATTAAATT CaTTGTAAAT 10260 

GCAATGCTTA GTCCAATTAA CAGTAATAAT GTTTCATTAA CAATTAAATG TGCATCAATT 10320 

GATTTTGCAT TAAAAACATG AAGTAATATA ACGCAACTCG TAATGACACC TTCTTGTACT 10380 

TTTAATACGA CAGTTAATGG TATAAATAAC AATACGATAA TACCGAOTAC AATTGGACTC 10440 

TGACCTAATA AACTAAATAT TGCTGAACCT AAAAACAATA CTAAAAAACA TGATACTAAT 10500 

CTTGAAATAA TCGCTTGTAG CGAATGTACT TTTGTATGTT TAATACATAA TACGACTAAT 10560 

ATGGCGCTTG AAGCATAATT ATCTAAACCT AACAGCTTAC TAATAATTAC ACCTAAAGTC 10620 

ATACCCACTG CTGTTTTTAT TGTTCTAAAT CCAATCTTGT AAGGATTTAA CTTTAACATG 10680

GGTTAGCGCC TCTTATCTTT CTTCACAATA TTTATTGAAT AATGTTTGTA ATTGATTAAT 10740 

TACGTTCATC ACATCATGAC CTTCGATTTG ATGTCTTTCA ATCATTTCTG TAATCTTTCC 10800 

ATCTTTTACT AATGCAAATG ACGGACTTGA AGGCGCATAA CCTTCGAAGT ATTCACGCGC 10860 

TCTTTGTGTC GCTTCTTTAT CTTGTCCAGC AAATACTGTC ACTAGACGAT CAGGTAATAC 10920 
AGAATTGATC ATAACTAGTG TTGTACCATC TTGTTTAAGA ACTTTGTCAA CATCTTCTGC 11040

AGTAGTTAAT TGCTCATATC CCGCAGATTC AATTTCATTC CTTGCTTGTT CTACAACACC 11100

GTTCATGTAT AAATCGAAAT TCATGnCCAT AAGTTCAATC ACCTATCCCT TTATATTTAA 11160

ACTAtCCTCA TTCTACTAAT TAATAACATA TTGTTCAATA AACTAATCTG AATCACACCT 11220

ATATTTAGAC ACAATTTTAA CAATATACCA AACATTATTG TGCTTAAAAT CATGGTAACT 11280

AATTTGTTCA CATGTTTTCA TTAATATGTT TCAAGTATGA TGTCTTATTT TGACTTTACT 11340

GCAAAAATGC ATTCAACCAT GTTGATTATT GTTCTTTATC TTTTTTGAAT ATATTGCACA 11400

TATTTTAGTG CCAAAAAATA ATACATCCAT GGACAAGAAC AAGATAAAAC AAGTTGTCGA 11460

TAGATGCATC TATGTTATCA CTAATATATA TTTGTATTTT CTAAAGTATA CTGTTCGATA 11520

CGCTGTTTAA TATGATTCAT ArATTTACCT GTTTGTAAAC CATCTAAAAT ACGATGATCA 11580

ATTGAAATAC ATAAATTAAC CATGTTACGA ATTGCAATCA TATCATTAAT TACTACTGGC 11640

TTTTTAACGA TTGATTCTAC TTGTAAAATC GCTGCTTGTG GATGATTTAT AATACCCATT 11700

GATGATACTG AACCAAATGT ACCAGTATTA TTTACCGTAA ATGTACCGCC CTGCATATCT 11760

TCAGCTGTCA ATTGCTTATT ACGCGCTTTC GTTGCTAAAG TATTAATTTC TCTAGCTATA 11820

CCTTTGATTG ACTTTTCGTC TGCATGCTTA ATCACAGGTA CGTATAATTT ATTITCATCA 11880

GCAACAGCAA TTGAAATATT AATGTCTTTA TGTAAGACAA TTTCATTTCC TTGCCAGCTA 11940

CTATTTAATA AAGGATATGC TTTTAAAGCA TCTGCTACAG CTTTTACAAA GAAAGCAAAG 12000

AACGTTAGAT TATATCCTTC TTTATTTTTA AAGCTGTTTT TATAATGATT TCTCGTATTC 12060

ACAA6ATTTG TAGCATCTAC TTCAATCATC ATCCATGCAT GTGGAATCTC TGTTACACTA 12120

TTAACCATAT TTTGCGCAAT TGCTTTACGC ACACCATTTA CTGGTATTGT GCTGTTTTCA 12180

CTATTGTCTT CAGATGATTG GTTACTTGAT GTATCTACTG ATGTTGATTT TGTTTGAACT 12240

TGTTTGTCAG ATTGAGCTGT GGTACCACCA TTTTCAATAA CTGACATTAT ATCCTTCTTA 12300

GTTACACGAC CTTCAAATCC ACTACCTACA ACTTGT6ATA AATCAATGTC ATGCTCTGAA 12360

GCGAGTTTAA ATACAACAGG TGAAAAGCGA CCATTATTAC GTGGTTGATT TTGTTTAGCA 12420

GTAGATGTCT GTTCCACTGT TGCACTAGCT TTTTTAGTAG ATTTCTGAGT ATGCTCATCC 12480

ACTTTTGCTT GTATCTCTTC AGTTGTTTCA TTTGTCTTTT CATCAGCAGT TTCAATTTTA 12540

CAGATAATTG TATCAATAGC TACTGTCTGC CCCGCTTCAA CTAAAATTTC TGTAATTGTT 12600

CCTGATATCG TGGAAGGGAC TTCAGCTGTC ACTTTATCTG TAATAACTTC ACATAATGGT 12660

TCATATTCAT CAATATGATC ACCAACAGAA ACTAACCATT GTTCAATGGT GCCTTCATGA 12720
AATTCACGCA TTTTATTTAA GATTTTTTCT GGATTCATCA TAATTTCATT TTCTAATACA 12840

GGAGAAAATG GCATAGATGG TACAtCTGGA GCAGCTAAAC GCATGATTGG TGCATCTAAA 12900 

TCGAACAAGC AATGCTCTGC AATAATCGCT GACACTTCTG ACATAATACT ACCTTCTAAA 12960 

TTATCTTCAG TTACAAGTAA AACTTTACCT GTAT6TTTAG CACGATCAAT AATTGTTTCT 13020 

TTATCTAATG GATAAACAGT TCGTAAATCA ACGACTTCAA CATTGATACC GTCTGCAGCT 13080 

AAAATATCCG CTGCTTGTAA ACAATAATTG ACCATTAATC CATAACAAAA TACTGTTAAA 13140 

TCTTCACCTT CACGTTTCAC ATCTGCTTTT CCTAAAGGTA CAGTGTAATA TTCTTCTGGC 13200 

ACTTCTTCCT TTAAGAAACX3 ATAAGCTTTT TTATGCTCAA AGTACAATAC TGGATCATTT 13260 

GATTCGATAG ATGATAATAA AAGCCCTTTA GCATCATACX3 GTGTGGAAGG AATAACAATT 13320 

GTTAAACCTG GCX3AT6AAGC AAATATACTT TCAATACTTT GTGAATGATA TAGTCCTCCG 13380 

TGAACACCGC CACCAAATGG TGCACGAATC GTTAATGGGC ATTGCCAATC ATTATTTGAA 13440

CGATAACGCA TTTTCGCAGC TTCACTAATA ATTTGATTTG TCGCAGGTAA AATAAAATCT 13500 

GCAAATTGAA TTTCTGCAAT TGGTCTTTTA CCTACCATAG CTGCACCAAT GGCAGTTCCA 13560

ACAATATTTG ACTCAGCTAA TGGCGTATCG ATAACTCTGT CTTCACCATA TTTTTGTTGC 13620 

AGTCCTTGAG TAGTACCAAA TACGCCACCT TTTCTACCAA CATCTTCACC AAGAATAAAC 13680 

ACATCTTTAT TTTGTTGTAA TGCTAAGTCT TGTGCCtGcG TATCGCCTCT AAATAA6ATA 13740 

ATTTAGCCAT TAGTTAAGAC TCCCTTCTTC 6TACACAAAT GCATAGGCTT CTTCGACACT 13800

TGGATATGGC GCGTCTTCAG CAGCCTTTGT CGCTTTATTG ATGATGTCTT TnATgTCCGC 13860 

TTCTATTTCT GCCAACCAAG CATCATCGAT AATGCCAGCT GAAAGCAACT CTTTTTTGAA 13920 

CTTTTCATTG CAGTCTGCTT TTTTAAGcGT TTCACGCTCT TCTTTCGTAC GATATTGGTC 13980 

GTOCrCATCT GATGAATGAG CTGTCATACG ACTTGTTACT GCTTCAATCA AAGTTGAACC 14040 

TTGACCAGAA ATAGCTCGAT CTCTTGCTTC TTTCATCGCT TTATACATTG CTAATGGATC 14100

ATTACCATCT ACTTGTTCAC CATGTATACC GTAACCAAGT GCTCTATCCG ATAATTTTTC 14160 

AGCTGCGTAT TGTAATGAAT CAGGTACTGA AATTGCATAT TTATTATTTA TAATGACACA 14220 

TACAAAAGGA AGTTTGTGTA CACCCGCGAA GTTTAAACCT TCATGGAAGT CACCTTGGTT 14280

TGAGCTACCT TCACCAACAG TTGCTGTTGC AATTTTCTTC TTACCATCCA TTTTTAAAGC 14340 

TAAAGCAGCA CCAACAGCAT GGGGTATTTG AGTTGCTACC GGTGAACTTT GAGACAAAAT 14400 

ATTCTTAGCT CTACTACTAA AGTGTGATGG CATTTGTTTT CCACCAGAGT TAACATCGTC 14460 

TTTCTTTCCA AACGCTGATA AAAACGTATC ATACGCTGAG ATACCCATAT AAGTAAGGAA 14520 
AATCTGAGTT GCTTCTTGTC CTTGACCACT TACAACAAAT GGAATTTTAC CTGCACGGTT 14640

CAATAACCAC AGTCTTTCAT CTATTTTTCT ACCTAAATCC ATCCATTTAT ATATTACTTT 14700

TAGGTCTTCT TCGCTAAGGC CTAATGATTT ATAATCAATC ATGTTAAATC CTCCTATTTA 14760

TACGTGAATA GCTCTACTTT CTGCTTTCAA TCCTAATTCC ATCAACACTT CAGAGATGGA 14820

AGGATGTGCG TGTGTTGTTA GTCCTAATTC TAATGCCGAG CCATTCATGA ACTGTAACAG 14880

TGATGCCTCA TTAATCAATT CTGTTACATG TGGACCAATC ATATTAATAC CCACAATTTC 14940

TTCAGTTGAT TGATCAATCA CCATTTCGCT ATACCCTTCG TTTGTGTCAT GGCTATCAAT 15000

CACTGCTTTA CCAATTGCTT TAAATGGTAC TTTAAAACTT TTAACTTTCA TTCCCTCTGC 15060

CriTGCTTGT TCAATGTTTA AACCGATAGA AGCAATTTCA GGTTGTGAAT AAATACACTT 15120

AGGCATCATG TTATAGTTTA CTGGGATTGG GTTCCCCTCA AACATATGAT CAACAGCCAC 15180

AACACCTTCT TTTGATCCAA CATGTGCCAA TTGTAATTTT CCTATACAAT CACCAGCTGC 15240

ATAAATATGT TTATCTTCAG TTTGTTGAAA TTCGTTCGTT AAAATATGTC CTGATGTTGa 15300

AAGtTTTATT TTAGTGTTGT TTAAACCAAT ATCTGATGTG TTAGGTTTTC TACCAATCGA 15360

TAGCAACACT TTATCTACTT TAATTATGTC TGAGGAAATT TCAAACGTAA CACCATCTTC 15420

GTTAACATTT ATATCATTTT CAGAAAGTTT TATTCCCTCA TAGAATTTAA CACCACGTGC 15480

TGACAATGAT TTTTTTAATA GTTGTGAAGC TTGTTTACTT TCAGTTGGTA AAATTCTTTC 15540

ACCTGCTTCT ATAACTGTTA CGTCAACACC TAAATCTATC ATCAATGATG CAAATTCCAT 15600

TCCX3ATAACA CCACCACCAA TAATACCAAT ACTTGATGGT AACGTCTTTA ATGATAATAT 15660

ATCATCGCTA GATAAAATTT TATCATGATC AAATGATAAG AATGGCAACT CTGCAGGCGA 15720

AGAACCAGTT GCAATTAATA CAAATTGGTT GGGTAATAAG TCTGATTCAC CATCTTCATA 15780

TTCGACAGAA ATTGTGCCAC TTTGAGGTGA AAATATAGAT GTACCTAGAA TACGTCCOGT 15840

GCCATTATAA ATGTCAATGT GATTGTGTTG CATTAAATGC TTTACACCTT GATACATTTG 15900

ATTAATAATG TCTTCTTTTC GTGCCAACAT ATTTTCAAAA TTAACATTAG CATCTTTGAC 15960

ATCAACGCCA AACATTGCTG CCTGTTTTAC TGTTTGAAAT ACTTCAGCAG ATTTAAGCAG 16020

CGATTTAGTA GGAATACAAC CTTTATGGAG ACAAGTACCT CCTAATAGTT GTCGTTCTAC 16080

TATTGCCACT TTTTTACCTA ATTGAGACGC ACGTATCGCA GCAACATATC CTGCAGTACC 16140

TCCACCGAGA ACGACTAAAT CATATTGTTT CTCTGACATG TTCTTACTCC TAACTAATGA 16200

TATATATCCA TTGAAAATTT ATTAATACAT AGTTTTCATG TCCATTAATT ACCTATTTTA 16260

CATGATTGTC TATTTAOTTT GAATGCACAT AAATAAATCC ATAAATGAGT ATTCAACACA 16320
TAAATCAGTA ACACTTGCAC CTGAAATCAT TCGTGCAATT TCATCTACTT TATCATCGCT 16440

AATTAACTCT TGAACTTGTG TTGTTGTACG ATCATCTTTT GATGATTTCG AAATTAATAA 16500 

ATGATGGTCG CTCATCGATG CAACTTGTGG TAAGTGAGAG ATACAAATAA CTTGTATATA 16560 

TTCTGCTaTA TCTCGCATTT TCTCTGCCAT TT 16592 
(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13794 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

CCAATACAAC GTAAAAAGAT TGCTTGTGTT ATTAATGAGT TAGATAAAAT AATTAAAGGA 60 

TTTAATAAGG AAAGAGACTA CATAAAATAT CAATGGGCTC CAAAATATAG CAAAGAnTTT 120 

TTTATACTTT TTATGAACAT TATGTACTCA AAAGATTTTT TAAAATATCG ATTTAATTTA 180 

ACATTTCTTG ATTTATCTAT CTTATATGTA ATATCATCTC GAAAAAATGA GATACTAAAT 240 

TTAAAAGATT TGTTTGAAAG TATTAGATTT ATGTATCCTC AAATTGTTAG GTCAGTTAAT 300 

AGATTAAAXA ATAAAGGTAT GCTAATCAAA GAACGATCCC TTGCAGATGA AAGGATTGTG 360 

TTAATCAAAA TAAATAAAAT ACAATATAAC ACTATTAAAA GCATATTCAC AGATACTTCC 420 

AAGATTCTCA AACCAAGAAA AT'nTrCTTT TAAATTTAAA CAGATTTACC TCTTGATAAA 480 

ATAAATAAGC AATCATACTA CTTCTCAATT TAGTATAAAT AAAAATACAT AATTAACTTT 540

CTTTTGTTTT TATATTATTT CAATACCCTA CTATATATCA CAACACATAA ATTAAGCATG 600 

ACACTCATTC AATTTAGTTC ACCATTTCGT GTTCCAATTT TACTGAGTAT CATGCTTTTA 660 

ATGTTATAAA CCTAATGCTT TAATAAATCG TGTTAATTCT TCTCGCATAC TGTCATCTTT 720

CAATGCATAT TCTATGGTAG TTTTAACGAA GCCTAATTTT TCTCCAACGT CATAACGTTC 780 

GCCTTCGAAG TCATATGCAT ACACTTGGTT ATCATTATTC ATACGTTCAA TCGCATCTGT 840 

TAACTGAATT TCGTTACCTG CGCCTTCTTT TTGCGTTTTT AAATAATCGA AAATTTCAGG 900 

CGTTAATACA TAACGTCCCA TAATAGCTAG GTTTGATGGT GCCGTACCTT GTGCTGGCTT 960 

TTCAACAAAC TTTTTCACTT CATACTGACG TCCGTTTTTA GTTAATGGGT CAATAATTCC 1020 

ATAACGATGA GTATCTGCTT CCGGAACTTC TTGGACACCT ATAACTGAGT GCCCTGTTTC 1080 

TTCATAAACG TCAATCAACT GTTTCACTGC TGGCACTTCA GATTCAACAA TATCGTCACC 1140 
TAAACCTTTT TGTTCTTTCT GCCTTACATA AAAAATATTC GCAAGTTCCG TTGAATACTG 1260

AACTTTCTCT AGTAATTCAG ATTTACCTTT TTCTTTTAAC ACCaVTTTCTA ATTCTTTTTG 1320 

ACTATCAAAA TGATCTTCAA TCGCGCGTTT GTGGCGACCT GTCACTATAA TAATATCTTC 1380 

AATTCCAGCT CTTGCAGCTT CTTCAACGAT ATATTGTATT GTGGGTTTAT CTAAGATAGG 1440 

AAGCATTTCC TTTGGCATCG CTTTAGTTGC TGGTAAAAAT CTAGTCCCTA AACCAGCAGC 1500 

GGGAATGATT GCCTTTTTTA TTTTTTTCAA AGTTAATGTG CTCCTTTTCC TAAGTATTAA 1560 

ATCTATGTAT CAACGTCATT TTAACACTAA TTAGAACGCC TTCATAGTGT CATTGAGTAT 1620 

GTAATTATTT CTTGGGAAAT TTGTTTTAAT TTTAAAAAAC AGGCTTACTT CATATAATTT 1680 

ATGAAATAAA CCTGTCAATT TTGGATTGAT TATGCTTTGT GATTCTTTTT ATTTCTGCGT 1740 

AATAACGCTA AACCTAAAAT GCTAAATAAT CCGCCGAACA ACATGCCGTT GTTTGTTGAT 1800

TCTTCTCCAC CTGTTTCAGG TAGTTCAGAT TTCTTAGATT GTGCiTiTiT AGTTGGTACC 1860 

ACTGCTTTAA CCTTTTCATT GATTTCAATA ACAGGTGTTA CTACTTTACC TTGTTCCACT 1920 

GGTTTAGAAG GTTTTTTAGG TTCTTCTTTA GCAGGTGGTA TTGGTTTACC AGGTTCAGTT 1980 

GGTACCTCTG GCGTT6GCGG TGTTGGTGTT TCCGGCTCGC TTGGTACTTC TGGTGTCGGT 2040

GGTGTTGGTG TTTCCGGCTC GCTTGGTACT TCTGGTGTCG GTGGCGTTGG TGGCACGATT 2100 

GGAGGTGTTG TATCTTCTTC AATCGTTTGT TGACCTTCAT TATGACCACT TACTTGTGGA 2160

AGTGTATCTT CTTCAAAGTC AACACTATTG TGTCCACCGA ATTGATAATT TGGTTTATCT 2220 

TTATTTGTAT CTTCTTCAAT AATTTCAGTG TGCTTATTGA ATCCGTGAAT ATGTGGCACA 2280 

CTGTCGAAGT CGATATCAAT GATATTACCA CCTTGTTCAT ACTTAGGTTT GTCTTTCTCT 2340 

GTATCTTCTT CGAATGATTG GTTACCATTA TTTTGACCAT GAATTTGAGG TACACTATCG 2400 

AAAXCGATAT CTACGATATT GCCACCTTGT TCATATTTCG GTTTATCTTC TTCTGTGTCT 2460 

TCCTCAAATG ACTGATTACC GCTATTTTGG CCACCTTCGT AACCTAATTC ACTCTTAATA 2520 

TCCACGTGGC TATTTTCTTC GATTTCTTCA ATCACGCCAT AATTACCGTG ACCATTTTCA 2580 

GTTCCTAAAC CAGAATGAGA AATATGATGA TT6TTTTCAG TAATTTCCTC GATTGGTCCT 2640 

TGCGCTTGAC CATGTTCTTC AGGTAGTTCA TCTACTAGTT CAATCAGATT ACTTTCAGTC 2700

GTATATTCTT TCGTATCTTC AATTGTTGTA TQATCGCTAA CAGCACCAGT TACAATACCT 2760 

TTTGTAGAAT CTTCGTCAAA TTCAACTAGQ TTAGACTCAG TAGTAACCTG ACCACCACCT 2820 

GGGTTTGTAT CTTCTTCATA TTCAACAACA TCAGCATGAT GTTTTGAATT TTCATGTGTC 2880 

GATTCTTCAA AGTCTACATG AATAGAATCT TCTTCAGTTT CAATGGTACC TTCTGCATGA 2940 
TCTTCGATTG TACCAGTCAA TTCATGCTTC TCCACTGGCG GCTCTGATTT AAATTCAAGT 3060

TCGATAGGAG TACTATGTTC TATAATAGGT TCCTTTAGTT TATCTTTGCC GTCGCCTTGA 3120 

GCGTTATTAG AGTAAAATGC AACGCCATTT TTCCaAGTTA AATTACTTGT ATAATAATAG 3180

TTATAATATC CAAAAAGGTG TGTTTGAAAT TCTAAGTTGC TAGCATTTGA ATCATAATAC 3240 

CCTTCATATT TTATTACATA ATTTTTACTT TGGTCTAAAT TATTAAAGTT TAAAGAATAA 3300 

CCACCATTAG TATCAAAATC TAAACTCATA TTATCAGTCA CATCTTCAAA TTTGCTGACA 3360 

TCATCAAGCT TTGCATAnTn AgctTTCAGC TAAATCGTCT GAACCAATGT GTTTATATAC 3420 

CTTAACTGTT GGATTATTAA CCCCTGGTTT ATTTCCTTTA GTTACTTGAC CAGTTACTGT 3480

CACAGAGCTT AACGACTGGT TGTTAGGTTT CATGTACGCA AAATGACTAA ATTTCCCATC 3540 

TACTTTATTT AAAGTATCAA TTCGACCATT AGCTGTTACT CCCCAATTAT CTCTAACTCC 3600 

ACCTAAATAT TGAATATTAA ATATTTTGCT AACCGTAGTC TCACCCAATT TAACTTCAAC 3660 

ATTTTGGTTA CCTTTTTGCG TCACTGTTGT AGGATCAATA AATAGATTTA AAGATAATTC 3720 

AGCAGTTAAA TCTTTCTTTT CTTGTACATA TTCTTTAAAC GTATATCTAA CTTTTCTTTC 3780 

TCCAATTATT TCTCCTGTCG CCATAACTTG ACCATCTGTA CTTTTTATCT CCGGAACTTT 3840

ACGCAGTGTT GAGATACCAT GAGTTTCAAC ATTATCGCTT AATGTGAAAT CAAAATAATC 3900 

TCCCX3CCTTA ATTCCTTCTC CAAATTTCCA TTTATATTTC AAGGTTACTC TTTCTGCGTT 3960 

ATGAGGATTT ACAACATTCG TATCTTGTTT ATGTCCTACA ATTTCACTAC CTTCTTCTAC 4020 

TTCCACTTTA TTTGTTACAT CTGTACCTGT CGCTTTAGTT TCTTCCACTA CTTCTTTCTC 4080 

TGCAACTGCT GTAACGTCAt TGatCTTTTC ATTCTTGGTT TAATTTCTGA GACGTTACTT 4140

GGTTGAGCTA TGTCAACTT6 AGTTCCTGTA GTTTCCTTAT CAGCAACTTT TTCCGATGGC 4200 

AAATCAACTC GCGAAgTTTC TACTTTTGGT GCTTGCAcAG TTTTCGGTGC TTCTTCTGTT 4260 

GTTACTTGTG TTGATTGTGA TGGTTGCTCA GTTGATGTCG CGCTGTATGA TTGTGTTTCA 4320

TCTATTGTAT TAACGTTATT TGTAGTTGTT TGTGTTTCGC TTGCTTTACT TTCAGTAGCT 4380 

GAACTCCCAC TTTCCTCTAC TGTAGTATTG TTTTGTTCCG ATGCTGCAGC TTCTTTTTCT 4440 

TGTCCCATTC CAACAACGAT CATTGTTCCT AAGAATACTG AGGCCGCTCC CAATTTGTGT 4500

TTTCTTATGC CGTATCTAAG ATTGCTTTTC ACTATAATAT TCTCCCTTAA ATGCAAAATT 4560 

CATTTATTTT TAAAACTCAA TAAATGCAAT TCTATATTGT TCGGTTTTTA AAAGCAATGA 4620 

AAAAAAGCGA GTTAATAAAA AGTTAAGATT GTTGTTAACT TTATGTATAA TGAGTTTTTT 4680 

ATTATTTGAA ACTCACATAT ATATTGCATA CAAAGCTCTT GAACACCTTG ATATAACAGG 4740 
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TACTAAACCA TACATAATAA TCGCCTGTAC AATGCATCAT TAACAAGTCA CTGAAACGCC 4860 

TTTCATTGTA TTAATAACGT CACTATAATT TTTATATCGT TCGGTTTTTG TTTGATTTTA 4920 

ATGATTATTT ATACAAAAAC AGCCGTATTT CAAGCCGACA TTTTAAATTT AACTAAATTT 4980

GCATCTAGTT AATAATTGCA TTTATCAAAT TTGTCTTATT GATCCAATCT AATTTGTACT 5040 

CACAAACTAG TTTAAAATTC TAACTTTATC TCTCAGTTCG TTATCAATCA TCAGACATAA 5100 

ACCAATGAAG CAATCAGAAA ACACTCTAAT TTTCTATTAG AAATTTGATT TAATATAAAA 5160 

AAACAGGCTT ACTTCATATA ATTTATGAAA TAAACCCGTC AATTTTTGTT TAATTATGCT 5220 

TTGTGATTCT nTTATTTCT GCGTAATAAT GCTAAACCTA GAATGCTGAA TAATCCGCCG 5280 

AACAACATAC CTTTGTTTGT TGATTCTTCT CCACCTGTTT CAGGTAGTTC AGATTTCTTA 5340 

GATTGTGGTT TTTTAGTTGG TGCCACTGCT TTAACCTTTT CATTGATTTC AATAACAGGT 5400 

GTTACTACTT TACCTTGTTC CACTGGTTTA GAAGGCTTTT TAGGTTCTTC TTTGGCAGGT 5460

GGTACTGGTT TACCAGGTTC AGCTGGTACC TCTGGTGTTG GCGGTGTTGG AGTTTCTGGC 5520 

TCACTCGGCA CTTCTGGTGT CGGTGGTGTT GGTGTTTCCG GCTCACTTGG TACTTCTGGT 5580 

GTTGGTGGCG TTGGTGTTTC CGGCTCACTT GGTACTTCTG GTGTCGGTGG CGTTGGTGGC 5640 

ACGATTGGAG GTGTTGTATC TTCTTCAATC GTTTGTTGAC CTTCATTTTG GCCGCTTACT 5700 

TTTGGAAGTG TATCTTCTTC AAAGTCAACA CTATTGTGTC CACCGAATTG ATAACTTGGT 5760 

TTATCTTTAT TTGTATCTTC TTCAATAATT TCAGTGTGCT TATTGAATCC GTGAATATGT 5820 

GGCACACTGT CGAAGTCGAT ATCAATGATG TTACCGCCAT GTTCATACTT AGGTTTGTCT 5880 

TTTTCTGTAT CTTCXTTCGAA TGACTGATTA CCTTTATTTT GACCATGAAT TTGAGGTACA 5940 

CTATCAAAAT CGaTATCTAC GATATTGCCA CCTTGTTCAT ATTTAGGTTT GTCTTCTTCT 6000 

GTGTeTTCCT CGAATGACTG GTTACCGCTA TTTTGGCCAC CTTCATAACC TAATTCACTC 6060 

TTAATATCAA CGTGGCTATT TTCTTCGATT TCTTCAATCA CGTCATAATT CCCGTGACCA 6120

TTTTCAGTTC CTAAACCAGA ATGAGAAATA TGATGATTGT TTTTAGTAAT TTCCTCGACT 6180 

GGTCCTTGTG CTTGACCATG CTCTTCAGGT AATTCATCCA CTAATTCAAT CAGATTACTT 6240 

tCAGTTGTAT ATTCTTTCGT ATCTTCAACT GTTGTATGAT CGCTCACtGC GCCAGTTACA 6300

ATACCTTTTG TAGACTCTTC GTCAAATTCA ACTAAGTTAG ACTCAGTAGT AACCTGACCA 6360 

CCACCTGGGT TTGTATCTTC TTCATATTCA ACAACATCAG CGTGATGTTT TGAATTTTCA 6420 

TGTGTAGATT CTTCAAAGTC AATTGGATTT GATTCCTCAG AGGACTCAGT GTATCCTCCA 6480 

ACGTGACCTG CtTCGCTATC CACAGCAGTA TGGTAATCGA TATCAATAGC TGATGAATCC 6540 
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TGGTAATCAA TGTCAAGAGT TGATGAATCA TATTCCTCTT CAACAGTAGT TACTAAATTC 6660 

TTATCATATT GACCTGTAAG AGTTTCTTTA ATTGTATCTT CTTTATATTC AAATTTATTA 6720 

TTTTGAATAA TCGGACCATT TTTCTCATTT CCGTTCGCTT TATTACTGTA TAAAACTAAA 6780

CCATTATCCC AAGTTAAGGT ATATCCTCTA TCATAATAAT ACTTATAAAG TTGCTCTGGA 6840 

TGTCCTACCA TTTGTGTTCT AAAATCAACT TCATCAGTAC CATTTAAATA CTCTCCATCA 6900 

TAGTGAACAA CATAAGTTTT ATCTAGATTT TCTATATTCA ATGAATAGCT TCCATTATTT 6960 

TGTAAATTCA AATTCCCACT CATATTACTT GTGACTTCTT TAAATTTAGA AGTATCTGTC 7020

GTATTTGCAT ATACACTCTT CGCTATGTCT TCATTATTAC CCAAGTATTC AAATATCCTA 7080 

ACTTTTGGTT GATTTCCATT CTGATTACTA CCTTTCATTA AAGTTCCAGT AACAGTCACA 7140 

CTTGTCGTTT TACCATTATT AGGTTTAATA AATGCAACAT GCGAAAATCT ATTATTCGCT 7200 

TTATTAAATG TCTCAATCGA TCCATTTAAA TTGGCATAAT AATTCCCAAT ACCATCTTTA 7260

TATTTAACAT CTAATTCCTT TGAAGTTTGT TCTTCATTTA GTGTTGAAGT TATAGTTTGA 7320 

TTTCCATTAG TTTGTACAGT TTTAGGATCA ATAAATAAAT TAATTTCTAG TTCAGCCGTT 7380

ACATCAACCT TATCTTCAAT ATCATTTGTA AATGTATATC TAATCTTTCC ACCTTCTAAA 7440

ACTTCACCTG TCGCCATTAC GACTGAACCA TTTTTAATTT CTGGTACTTT TCTAGCAGTT 7500 

GATACGCCAT GCGTATTTAC ATTATTTGAT AAAGTAAAGT CAAAGTAGTC ACCTTGATGT 7560 

AAACCATTCT CAAATTTCAA CTTATATTTT AGTACCGCTC GTTGTCCTGC ATGAGGTTCT 7620 

ACTTTATTTG TATTGTTATG CCCCTCAATA GAACCAATTT CTACTGTAAC TTTACTTGTT 7680 

AO^TCTGTAC CCGTTTCCAC TTTCGCGTTA CTAGCTTCCT TAGCTTCCGC TACATCTGCT 7740 

GATCTTGTCA CACGTGGCTT ACTTTCTGAT GCCGTTCTTG GCTGTGCCAC TTCAACTTGT 7800 

GTTTCTGCGA CTTGATTTTG TGTAGCCTTT TTAGGTGTTA AATCTACTTG TCTTTGATCT 7860 

CCGCTATTGT CTTGAGATTG TGTTGTTTCC TTAACTTGAG GTTTCGCTTC TTCCTTAACT 7920

ACCTCTTCTT TAACTGTTTC TATATTTGCT GGTTGTGCAG TTTGTGGTGC TTGTACTGCT 7980 

TTTGGTGCTT CTTCAGTTGT TACTTGTGTT GCGTTTGACG GTTGTTCTGT TACTGTTGCG 8040 

TTATATGATT GAGTTTCTTC TATATGATTA ACGTTAGTTG CAGTTGTTTG TGTTTCACTT 8100

GTTTTATTAT CAGTAGCTGA ATTCCCATTT TCTTCTACTG TAGTTGTCTT TTGTTCTGAT 8160 

GCTGCAGCTT CTTTGTCTTG TCCCATCCCA ACAACGATCA TTGTTCCTAA GAATACTGAT 8220 

GCTGCTCCCA ATTTATGTTT TCTAATGCCG TACCTAAGAT TGTTTTTCAC TATAATATCT 8280 

CCCTTTAAAT GCAAAATTCA TTAATTTTTT AAACTTAATA AATGCAAGTC TATATTGTTC 8340 
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ATGTTAATTG ATAATTTTAT TATTTGAAAT ATACCTATAA ATTGTATTCA AGTCATCAGA 8460

AACCCTTGTC ACACAAGGCT TGTATTTTTT ATACTTATTT TTTAAATTAA ATTCATCATT 8520

ATCTAATTTA AAACAATATA CTAAACGTTT CATAATTATC GCCTGTACAA TACGCACAAA 8580

AACATGTCTT GAAACGCCTT TCATTACTCT AAAATACCCA ATATACTTTT TATATCGTTC 8640

GGATTCTGAG TATTTCAGAC GATTTTCTGC ATAAAAATAA ACGTGTTTCA AGGCAATATA 8700

TTGCAATTAC CTAAAAACAC GTTTACTTAA TATTTAGTTA AACAAATAAG CTAATGAATA 8760

AAATGAAGAT GATACCTGAA ACGGAAATAA TCGTTTCTAA TAATGACCAT GTTAAGAATG 8820

TTTCTTTTAC AGTTAAACCA AAATATTCTT TAAACATCCA AAATCCTGCG TCATTTACAT 8880

GAGACAAAAT CACACTACCT GCACCTATCG CAAGTACAAC TAATGCAACA TTTACATCTG 8940

ATGATTGTAA TAATGGTAAG ACAATACCTG TAGTTGAAAT CGCAGCTACT GTAGCCXSAAC 9000

CTAATGCGAT ACGTAGCACA GCTGCAACAA TCCATGCTAG TAAAATCGGA GACATCTCTG 9060

TACCTTCAAA CATTTTAGCA ATTGTATTTC CGACACCGCC GTCAATTAAT ACTTGTTTAA 9120

ATGTACCGCC ACCGCCAATA ATCAATAACA TCATTCCGAT TGGATAAATC GCATTCXTTCA 9180

CTGATTCCAT AATATGATTC ATCTTACGCT TTCTCATTAA TCCCATCGTA ACGATTGCAA 9240

ATAATACTGC TATTAGCATG GCTGTCCCTG CTGTTCCTAT CATATAAATG ATAGATTCAA 9300

ATAGATTTGT AGGTTTGTCA TGCCCAGTTA CAAGTTGCGT TATCGTAGAC ACTAACATTA 9360

ATATGACTGG TAATGTTGCT GTTAATAAAC TCATACCAAA TCCTGGCATC TCTTGATCCG 9420

TAAATTCTTT TTGTGCACCT AACGCTGAAA TATCGCCTTC TCGTGTATAC GCAGACGGAA 9480

TCATTTTTTG TGCAcTTTGT TAAATATAGG CCCTGCAATG AGTGTAACTG GaATGGCAAT 9540

AATCATACCA TACAGTAATA CATCTCCAAC ATTTGCCTTT AATTCTTTTG CGATGACTAC 9600

OGGTCCTGGA TGTGGTGGTA AAAAGCCATG TGTCACTGAT AAAGCTGTTA CCATAGGTAG 9660

TCCTAGTTTT AACACTGAAA CATTTGCGCG TTTTGCTACT GTAAATACTA ATGGAATCAG 9720

TAAGACTAAA CCTACTTCAA AGAACAATGC AATACCGACG ATAAATGCTG CAACAAGCAT 9780

TGCCCATTGT ACATGTTTTT GACCAAATTT TTGAATCAAC GTGTCTGCGA TTCGAGTTGC 9840

ACCACCACCA TCAGCAAGCA ATTTCCCAAG TATGGCACCT AAACCGAATA TCAGTGCAAT 9900

GTGGCCGAGC GTACTGCCCA TTCCTTTCTC AATCGTCTCC ATAATTTTAG TCAATGGTAT 9960

ACCTAGCATT AACGCTGTAA TCATCGATGT GATAATTAAT GAAATAAATG TATTTAATTT 10020

AAACCCAATA ATTAATACTA ATAAAATAAC GATACCTAAA ACAACACTGA TTAACGGCCA 10080

TATTTCGTTA AACATGACAT TCCCCTCTTT CTCTTTTCAA TAGAATGTAA CACCGTCGTC 10140
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GAGTGACGTA TTTATTGTGT TTTATTTTCA GCGATATGTT GGCGTTGAAA ATCTGCAATT 10260

TGTTCATAAT TCTCTGTTAA AGAACGACTT AAATTGATAA AAATGGATAC GATCTCTTGG 10320

TAAACAGTGA CATTTTCTTC AATCGGCGTA TGATTGTTTG TGGCACCGAC CATCGATGAA 10380

ACGATTGAAA AATCTTCAAT GTCACCTACA GCTTTAAGTC CGAGCACGCA GGCACCTAAG 10440

CATGAACTTT CATAACTTTC AGGAACCACT AACTCTGTGT CAAATATATC TGACATCATT 10500

TGACGCCATA CTTCACTTTT CGCAAAACCA CCTGTTGCTT TTATCATCTT AGGTGTTTCA 10560

TTCATTACTT CAATAAGCGC AAGATAGACG GTATACAAAT TGTAAAGAAC ACCTTCTAAT 10620

GCAGCGCGAA TCATATG1TC TTTTTTATGA GATAAAGTTA AACCX3AAGAA TGAACCTCTT 10680

GCATTTGCGT TCCAAAGCGG CGCACGTTCT CCTGCTAAAT AGGGATGGAA TATTAAACCA 10740

TCTGCACCTG GTTTAACACG CTTTGCAATT TGAGTTAAQA CATCATAAGG ATCAACACCG 10800

AGACGTTTCG CAGTTTCGAC TTCACTCGCT AGCAACTCGT CGCGCAACCA TCTCAATACX3 10860

ACACCACCAT TATTTACAGG ACCTCCGATG ACGTAGTGGT CCTCTGTTAA GACATAACAA 10920

AATATTCTAC CTTTGTAATC AGTACGCGGT TTATCTATCA CAGTACGAAT CGCCCCAGAT 10980

GTACCGATTG TGACAGCAAC TTCTCCTTTA CCAACACTAT TGACACCTAA ATTAGAAAGG 11040

ACCCCATCAC TCGCACCAAT AACAAACGGT GTATCTTTAT TAAGCCCCAT TAATGTTGCA 11100

TAACGTTCTT TCATACCTTT CAtCACATAC GTTGTTGGAA CTAATTCCGG CAACATTTCC 11160

TTGGAAATAC CCAGCAGTTC TAATGCCTCA ACATCCCAAT CTAATGTTTC TAAATTAAAC 11220

ATCCCTGTTG CGGAAGCCAT TGAATAATCA ATGATATATG TATCAAATAA ATGATAGAAA 11280

ATGTATGTTT TAATATCTGC AAACTTAGCA GTACGTTGAA ATACATCTTG CCATTCATGT 11340

TTCATCCAAA AAATCTTCGC TAATGGCGAC ATAGGATGAA TCGGTGTGCC TGTTCGCTGG 11400

TAAATCGCAT TGCCATCATG CACTTCATTT ATTACTGTTG CATATTTTGC AGCGCGGTTA 11460

TCTCCCCAAG TAATATTATT TGTTAATCTT TGATGTTGCT GATCCATCGC AATCAAGCTA 11520

TGCATTTGCG CACTAAATGA CACAAACTTA ATGTCGTCTT TATTAACTTT GGATTCTCTC 11580

ATAACATATT TAATAGTCAT TAGTACTGCA TCAAATAATT CATCTGGGTT TTCTTCTGAG 11640

ACATCAACGT TTGGTGTGTG TAAATCATAG CCTATTTGAT GTTTCATGAT AAAAGTTCCA 11700

TTTTCATCAT ATAAGACTGA CTTGGTACTC GTCGTTCCAA TGTCGACACC AATCATATAT 11760

TTCATGATAA ATCCTTCTTT CTTTCATTTT AATTCAACCA AAATCCTTCA ATATCTTTAC 11820

CAACATCGTC GAAATTTAAA TGAAACGCTT CTTTCAAAAT TTGACTGTCG TATTGTTCCA 11880

CTGCATCAAT AAACACTTGA TGATTATGAT GTATGCGTTC AAAATCTTGC GGGTTCTGTT 11940
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AAAATGAGTT TAAATATTGA TGATTAGATG CTTTGATTAA TGTTTCATGA AATTCAAAGT 12060 

CATGCTTCGT AAATGATTCT GCATCCTCAA ATTTTACTGC CACTTTCATC ATTTCAAGTT 12120 

GTTTCTTCAT TTCTTTTACG ATAGGTAGTC GCTCTTGATT TTTAACTCTT GAAAATGCAA 12180 

ATGACTCTAA CATCAGTCGC AAATCATACA TTTCTTTCTT TTCTTGTTCC CCAAACGGCA 12240 

ACACATGTGC ACCCATTCTT TCTAATTGGA TGAGTTGATT TTGTTGCAAT AATTTAAATG 12300 

CATCTCGAAT TGGCGAACGA CTCACATTAA ATTGCTTTGC CATTTGATTT TCAGTGAGTA 12360

ACGTACCTTC AGCTATGTGA CCATTCACAA TGCCTAAGCG TAATTCTGCC GCGATACCTT 12420 

CTCCAGTTGT CATACCTTCC AACCATTTCT CTGGATATCC ATACATCATC AAAGTCACTC 12480

CTTCATTACA CX3ACATACTT GTATACAAGT ATGTTAATAT AGTTATTATG AGTTTGCAAG 12540 

CGCTTTCTTT ACGAGCACTA AAATAGTGAC CACCCCTTTT CX3ATTTAAAT TTAAAGGAAA 12600 

TGGTCACTAT CACACGAATG ATTTAATTGT TATGTTGTAT GTGGGATATT TCTAATTGTT 12660

CTGTACTCAT ATGCX5CTTTA GGTACTTCAA TGCAATAATG CGTTTCATGA CAGTTTGGAC 12720 

ATTCGAATCG ACGTGTTGTC GCTGTATGTT TCGCTTT6AT AACTGCCCAC AAAGATGGTG 12780 

AGAATATATG CTGGCAGTTA GGACATAAAT AGGCAACCTT TTGTTGGTAA TAAAAAGTAA 12840 

CACCAATGCC ATAACCAATC ATAAATGGTA AAGCAATTAA AAACGGCCAT TTATTTTTCA 12900 

TCAAAATTGC ACTTATAATG CTAGAATATT GAATTATTCC TATAATACCA GCACTAATCC 12960 

AAATGTTACG ACGAATACTT TTCATTTCAG CTGATTTACT CATGACATGC TCTATGTCTT 13020 

TTAAGTGTGT GATTGGAGAC GTCGACGCTT CATTTACGTA ATATTGAACA TTTTTAATTT 13080

TGTTTAATAC CGCTTGTTGC TGTTTAACTT GTTGGTTAAT TTCTTGTTGT TTCATAGTTA 13140 

GTAAAGTATT GAGCGTCTTC AAAGTACCTT CACCTTTTAG CAACATATCT ATATCGCTTA 13200 

ACGClCAACC TAAATCTTTA AGCAATAAGA TTAACTCTAA TGTTTGTCGC TGTTGTTCTG 13260 

TATACACACG ACGCTTTCCT TCTGTAAATC CTTGTGGTrT CAAAATACCT TTGCGATCAT 13320

AATATTGAAT CGTTCGTGTT GTCACATTGC ATAATTTTGC GAGTTCTCCA GTCGAATAGT 13380 

TAGACATAGA TTCCACCTCC TATAATTACC ATAGTTGATG ACCCGACGTC ACGAGCAAGT 13440 

ACAATTTCCA CATTTTAAAG AAATTTATTA TACTAGGCGT CTTATTTTTA TGATTTCGTA 13500

CCATGTTGAT TTACAAACTC ACTCAAACTA AGTAACACAC CTACTAAACA TCTACTCTGT 13560 

TATTTCAGAA TGAATTTGTT GTAATTTATC TTCAACTTCA GTAATCTCTG TCGCACATTC 13620 

TTTCAGTAAA TCTCGATACT TTTCCGTCTC TGCATTGTTT TTATAACGTA TTTTATGTTC 13680 

TAAACTTGCC CACATATCCA TACCTATCGT TCTAATTTGA ATTTCAACAG GCAATACCTC 13740 
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(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1059 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

GGATAAGTTC AGGTAAATTC ATTTCTTTTT CAATTTTGAT TTTCATTGTT TCCGCCCTTT 60 

TAAAATAAAG TTAGTTGCTT CTGTTCCTCA TATTCCAAAT CACTTTGCTT TATATATGTT 120 

TCAAGCTCTT CCGCTGTATC AAATGTCTTT TTCACACCTT GCCAACCTGG CACGATATGA 180 

CCGTGAAAGT AATAAGTGCC ATTTACTACA TGGATATGTG CCACTCGTTC GTTATCCTGA 240 

TACAGATATC TCTTAGATCC AAAGAATTGA TTTAGGTATT CTTTACGCGC GCTATCTGTC 300

ATGGTCATCA CTCCTTTTAA CAATTAGGCA GACCAAACGA CATGCATTCG TCGTATAGCT 360 

CTTCATTACT TATGCTTGCC TTATAGTTTT CAATCACATT GCTAACTTCT TTATGACTCA 420 

TTGCTTTAAC TTGTTCGTCT GTATATTTTT CGCAGTCTTC TAATTCCAGT TGCTCCTGTA 480

ATGACATCAC ATATTCAACT TGTCTTTGGG TTGCCATCGT TAACCCTCCC ACAAGTCAAA 540

AGCTCTTTGG ACGTAAAACT TCGCCTTTGC TAAATCCTCA TGACCATTCT TTAACGGTGC 600 

TCTAGACATG TATTTGATTG CATTACCTAT TGCGAATGCT AGTTGAGGTG GATACTGTGC 660 

CGTAACCTGT TCGATAAAAT CTATAATTTC AATGTCGCCG TATGTGTAGT GCGCTGGTTG 720 

CTTAACATTG TCTTGCGCTT CGTTCATATC TACTTTTCTG TTACTGATTA CGCTCATTAT 780 

GCTTCACTCC ATTTCTTGAA CATTTGGTTA TAAGTGACAT CGAACCAGTA CGGATCACGT 840 

GAAlfiTTTTT GTGGCGTTCC ATCATAAAGC CATGGTCTTA ATCTTCTCTT TCTTTCCTGT 900 

TCATATTCCG CTCTCACATT TCGTTGGTAT CGGTTCAAAA TCGCTTTTTT TCTGATTTTT 960 

TCTCTCCCTT TTTCTTCATC TTTtlATtTGA CTCTnCATAT ATTCAACTTC TTCTGTAGAT 1020 

nTTGAGTCCT TTCTTCCACA CAATAATTCA nCGCCGCGC 1059 
(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30246 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear
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GAAGTAAAAG AAGAATTAAA TTTAACATTA ACAATGGATG AAATTGAATA TGTCGGGACA 60

ATTGTAGGTC CTGCATATCC ACAACAGGAT ATGTTAACTG AGTTAAATGG ATTTCGCGCA 120

TTAACCAAAA TCGATTGGGA AAACGTAACT ATCAATAATG AAATTACGGA TATACGCTGG 180

ATTGATAAAG ATAATGATGC GTTGATTGCG CCTGCTGTCA AAGTTTGGAT TGAAACTTAT 240

GGTGGTAAAC ATGACAAATA ATGACACCAT CATGTTACGA CATTATGTCC CACAAGATTA 300

TTCGATGTTA GAAGCTTTTC AATTAAGTGA AAGTGATTTG AAGTTTGTTA AAACGCCAGA 360

GGAAAATATT ACAGCTGCAA TGTCTGATAA TGAAAGGTAT CCCATCGTTG TAATGGATGG 420

CAGGCAATGT GTGGCCTTTT TTACATTACA TC6TGGAAAA GGGGTCGCAC CATTTAGCGA 480

TAACCAAGAT GCAGTATTTT TCAGGTCATT TAGTGTTGAT CAACGTTATC GTAATAGAGG 540

AATAGGTAAA GTGGTAATGG AAAAATTGGC GTCATTTATC ACTTCAACAT TTCAGGATAT 600

TAATGAGATT GT6TTAACGG TTAATACTGA CAATCCACAT GCCATGGCAC TTTATCGCCA 660

ACAAGGATAT CAATATATGG GAGATAGTAT GTTCGTCGGA AGACCTGTTC ATATTATGGC 720

GTTAACTATA AAATAAATTA AATTTAAAAG CATCTTTACT CATCGTCGAC CACAACAATT 780

AATGATGAAT AAAGGTGCTT TTTGTTATAG ATCATCGGAC AATTTACTAT AGTAAAAAGC 840

GACCTAGTGA ACAATTGACA TATATCCACA GGTCGCTTAA CTTAAGTTAT ATTGCTAGTT 900

GCGATTAATT GATAGACTCA TCATTTTTGC GCTGTCX^GA TGGTCTTTTT ATTAAAAATG 960

CCGTAATCCA AGCCGTAATC GGAATACTGA TTGCAACGGC AATACCGCCT AAAATAATAG 1020

AAATAAATTC TTGGGCAAAT ATTTTCGAGT TTATAATATG ACCAAATGAA TATTTAAGTT 1080

TGAAAAACCA AATAAATAAA GCAAGTTGGC CACCAAAAAA GGCAAGGTAA ATCGTGTTC6 1140

CAGATGTCGC TAAAATTTCT CTACCAACAC GCATGCCAGA TTGGAAXAAT TCGTATTGCG 1200

TAACSTTgGA TTCACTTGAT GCAATTCATA AATGGGTGAA CTAATGGTAA TTGTTAAATC 1260

TATCACAGCT GCAATAACAG CAAGAATAAT AGTGAACACC ATAAATTGAA CCATATCAAT 1320

GCCAATATTC ATTGAATACA CATATGTTTC ATCTTGTTGT TCGGTTGaAA AGCCTTGTAG 1380

ATGACCGAAG TAGACCGATA AATAAATGAG TGTAATCAAC AATATTGTTG TAACGATAgT 1440

GCtGgATAAA TGCaGCTTGT GTTTTAACAT TGTAACTATT 1500

CGCCAATAAT AATGCAGAAA AAGAATGTGA CGACATAAAT CGGTACX3CCA AAAATAATCA 1560

ATACAATACT AATAATTAAA ATAGCGAAAT TTAAAAATAG GGTTAAATAA GAGATOAATC 1620

CCTTTTTACC TCCGAAAATT ATCATCAGAA AGAGGAGCAA TAACGCCAAT ATAAATACAG 1680

CATTCATTGT TTCGCCCTCC TTAATGTTTC AAATATTTCC ATAAACAATA TTGTGATAGG 1740
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CATCGAAATA GTATAAGTCA CTGTATTGGC ATTTTTTAAA AAGATTAAAA ACATAGGTAG 1860

TGCACCGGAT AAATATGAGA ATAATAAGAT GTTAGTCATT GTTCCCATAA TATCTTGGCC 1920

GATGTTTCGC CCAGCAAGCG CCCATCTCCT CATTGAAATG TGTGGCGTAC GCTGTAAAAT 1980

TTCATGCATA CCACTAGCAA TTGTAATTGC AACATCCATA ATAGCGCCAA GTGAACCTAT 2040

TAACACTGAG GCTAGGAAGA TATCTTTCGG TGGTAATGAT AAAAAGTTCA TCGTTTCATA 2100

TTTAATGCCT TTACCATCTG TCATATATAT GATTAATTCT GTTAAACCTA TACTCAAAAA 2160

AGTTCCGATA ATTGTACTGG CTATGGTAAT GAGTGTACGC ATATGCCAGC CTGTAACGAG 2220

CAATAAAGTG AGTATTGTTG AACAGATCAT GGCAATGGTC ATGAGTAAGA ATAAATTAAT 2280

ATTGCTATGT TGAATATGAA TGTAAATTGC GATTAATATG GCAATAGAAT TCAAGATTAA 2340

CGATAAAATC GATTGCA6TC CGACTTTGCG ACCAACCAAT AATACAGTTA ATAAGAACAA 2400

ACCAGTGATG ATAACCGTTA AGGTATCACG CTTCTTTTCT ATAATATAAG CATCACTCGG 2460

CTTGTTAGAA ATATGTAATA ATACTTTTTC GTGTGTGCGA AATGCCTCAG AATCTGCTTG 2520

CGATTTGACG TACTGATGAT TAATCGTCX3T CGTTTCTCCA GCAAATTGAC CATTTAATAT 2580

TTTGACTTTT AATTGATTTT TATATTTAAT ATCACGATTA TTTTGTGCAT CTTTTGTAGG 2640

TGTCGAAGAA ACATGTTTGA CATCTATAAT TTGACCAATT GGTTTGTTGT AAAAGTTCTC 2700

ATTATTGAAT GTAAATAAAA TAGCACCAAT GAATGCGATG CAGAACAAAC CTAAAATTAT 2760

ATTAAATGGC TTTGTAAATA AATTTCTATA TTTCAAAAAC AAAACCCCAA TTCTATGAAT 2820

GAATTAATAT GGTGATTATA CGCCCTTAAT TTTTTATTTT CAAAGATATT ACTGCTAAGT 2880

GTAAAACGAA AATCATCATT GATAGCATCG AATTACTTAA TGGAATGTAG ACGTTTTAGT 2940

CATTAATTGC TGAATAAGTG TTAATAATAT GCCAATATCA CTCTTTGTAT AAGGCTCCTT 3000

TGTiyVTAGCA CATATCGTTC TTTTTAATTC AGTATGATCT AATTTTATAT CTATCCATGA 3060

TTTAGATTCT GGTAAATGTA TATTTTGTGA TGAAATGATG TAACCTTCTT TTTGACGAAG 3120

GAGATACTGC GCAAGTGGTT GGCTACTGAT TGTGTATACA TCTGATTTAG TAATCTTGCG 3180

CAATTGTTTT TTTACAGTTT CGGCAAATGG TGCCAAGCAA TAAATATGAC TATGCTCAAA 3240

CTGAATTAAT GGTGGGTGTG TCGCCATCXST AATTGGATCG TCTGAAGGCG CATATAAATG 3300

ATAGTGCTCT TCGAATAAAG GTAGCATATG TAATTGTTTG TGTTTACGTA TTTCTGGTGT 3360

AAGTTCCGTG AAACCAATGT CTATATTCCC ATTTAATACG CTATTTATAA TTGTGTCATG 3420

TTCTAATAAG CTCGGTATGA CATGTGTATC ATTTTGTAAA TGAAACGTTT GGATAAGTGG 3480

TAGTAACATG TGGGATACGT CACTCTCATC ATAGCCAATG TAGATACTTT TATTTTTAGT 3540
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TTCATTAAAT AATAATTTCC CTTCAGATGT GAGCGTAATA TTGCGTCCTT GCTTTTTAAA 3660 

TAAAGACACA TTAAGTTCTT GTTCTAATAA TGTAATTTGA CGGCTTATCG CTGATTGAGC 3720 

AATGTTTAGT TCAAGTGCTG TTTCGGAGAT ATGTTCTCTT TTAGCGACCT CGATAAAATA 3780 

TCTTAATTGT TTAATTTCCA TAGCGATATA GGCACCTCCA AAAATGAGTG TTTTGTAACT 3840

ATTATAGCAA TATTATTGAT AAATGTTCTA TTTTTTAGAT GAATATCTTC TATTTTATAT 3900

ATTGAACAGA TAAATTTTTT AGATTATAGT AATTATCATT AATAACTAAT ATCAGAATAT 3960 

TCTAAAAAAG GGGTGTGCAT CATGCACAAT GAGAAATTAA TTAAAGGCTT ATATGACTAT 4020 

CGTGAGGAAC ATGATGCGTG TGGTATTGGT TTTTATGCGA ATATGGATAA TAAAAGGTCT 4080 

CACGACATCA TTGATAAATC GCTTGAAATG TTGCGACGCT TAGATCACAG GGGCGGGGTC 4140 

GGCGCAGATG GCATCACTGG TGATGGCGCA GGTATTATGA CTGAAATACC TTTTGCATTT 4200 

TTCAAACAAC ATGTAACGGA CTTTGATATC CCAGGTGAAG GTGAATATGC CGTGGGGTTA 4260

TTTTTTTCCA AAGAACGCAT TTTAGGTTCT GAACATGAAG TAGTTTTTAA AAAATATTTT 4320 

GAAGGCGAAG GGTTATCAAT TCTTGGTTAT CGTAATGTAC CAGTTAATAA AGATGCCATT 4380

GCTAAACATG TAGCAGATAC GATGCCAGTC ATTCAACAAG TGTTTATTGA TATTAGGGAC 4440

ATTGAAGATG TTGAAAAGCG TTTGTTTTTA GCGAGAAAAC AATTAGAGTT CTATTCGACT 4500 

CAGTGCGATT TAGAATTGTA TTTTACGAGC TTATCACGCA AAACAATTGT ATATAAAGGT 4560 

TGGTTACGAT CAGACCAAAT TAAAAAACTA TATACAGATT TATCGGATGA TTTATATCAA 4620 

TCAAAGCTAG GGTTAGTGCA TTCGAGATTT AGTACGAATA CATTCCCGAG TTGGAAAAGG 4680 

GCACATCCTA ACCGTATGTT AATGCATAAT GGTGAGATTA ACACGATTAA AGGTAATGTA 4740 

AACTGGATGC GAGCACGCCA ACATAAATTA ATCGAAACAT TATTTGGCXSA GGATCAACAT 4800 

AAAOTGTTTC AAATTGTCGA TGAGGATGGT AGTGACTCTG CCATTGTAGA TAATGCGCTA 4860 

GAGTTCTTAT CGTTAGCCAT GGAGCCAGAA AAGGCAGCGA TGTTACTCAT ACCTGAACCT 4920 

TGGTTATATA ATGAAGCGAA TGATGCAAAT GTACGTGCGT TTTATGAATT TTATAGTTAT 4980 

TTAATGGAAC CGTGGGATGG TCCTACAATG ATTTCGTTCT GTAACGGTGA CAAACTTGGC 5040 

GCGCTTACAG ATAGAAATGG ATTACGTCCA GGTC6TTATA CGATTACTAA AGATAACTTT 5100 

ATTGTCTTTT CATCTGAAGT GGGTGTTGTG GACGTACCTG AAAGTAATGT TGCTTTTAAA 5160 

GGTCAATTGA ATCCTGGAAA GTTATTGCTT GTTGATTTTA AACAGAATAA AGTCATTGAA 5220 

AATAATGATT TAAAAGGTGC GATTGCTOGA GAATTACCAT ATAAAGCGTG GATTGATAAC 5280

CATAAAGTTG ACTTTGATTT TGAAAATATA CAATATCAAG ATTCGCAATG GAAAGATGAG 5340 
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CAGGAACTTG TAGAAGGTAA GAAGGATCCT ATCGGTGCAA TGGGATATGA TGCGCCAATT 5460 

GCAGTGTTGA ACGAGCGACC AGAATCACTA TTTAATTACT TTAAACAGCT GTTTGCACAA 5520 

GTTACGAATC CACCAATTGA TGCGTATCGT GAAAAAATCX3 TAACGAGTGA ACTTTCTTAT 5580 

TTAGGTGGCG AAGGTAACTT ACTAGCACCT GACGAAACGG TTTTAGATCG TATTCAATTG 5640 

AAAAGGCCGG TATTGAATGA ATCACACTTA GCAGCGATTG ATCAGGAACA TTTTAAATTA 5700 

ACTTATTTAT CAACGGTATA TGAAGGGGAT TTGGAAGATG CGTTAGAAGC ATTAGGCCGA 5760 

GAAGCAGTGA ATGCTGTAAA GCAAGGCGCT CAAATTCTAG TGTTAGATGA TAGTGGATTA 5820 

GTTGATAGCA ATGGCTTTGC AATGCCGATG TTACTCGCAA TAAGTCATGT GCATCAATTA 5880

CTTATTAAAG CAGATTTACG TATGTCTACA AGTTTAGTCG CTAAATCTGG TGAGACACGA 5940 

GAAGTGCATC ATGTTGCTTG TTTACTCGCA TATGGCGCGA ATGCAATTGT GCCATACCTA 6000 

GCGCAACGTA CAGTTGAACA ACTGACATTG ACAGAAGGGT TACAAGGCAC CGTTGTCGAT 6060

AATGTTAAGA CATATACGGA TGTATTGTCA GAAGGTGTCA TTAAAGTAAT GGCTAAGATO 6120 

GGAATTTCGA CAGTGCAAAG TTATCAAGGG GCACAAATAT TTGAAGCGAT TGGCTTGTCT 6180 

CATGATGTGA TTGATCGTTA TTTTACTGGG ACACAGTCTA AGTTATCTGG TATTTCGATT 6240 

GATCAAATTG ATGCTGAAAA TAAAGCACGT CAACAAAGTG ATGATAATTA TCTTGCATCA 6300 

GGTAGTACAT TCCAATGGAG ACAACAAGGT CAACATCATG CTTTTAATCC GGAATCTATT 6360

TTCTTATTGC AGCACGCATG TAAAGAAAAT GACTATGCGC AATTTAAAGC ATACTCTGAA 6420 

GCGGTGAACA AAAATAGAAC AGATCACATT AGACATTTAC TTGAATTTAA AGCATGTACA 6480 

CCGATTGACA TCGACCAAGT TGAACCGGTA AGTGACATTG TCAAAOGCTT TAATACAGGG 6540 

GCGATGAGTT ATGGATCGAT TTCAGCGGAA GCACATGAAA CGTTAGCACA AGCCATGAAC 6600 

CAAOarAGGTG GAAAGAGTAA TAGTGGTGAA GGTGGCGAAG ATGCAAAACG TTATGAAGTA 6660 

CAAGTTGATG GAAGCAACAA AGTAAGTGCG ATTAAACAAG TTGCTTCTGG GCGTTTTGGT 6720 

GTAACTAGTG ATTATTTACA ACATGCCAAA GAAATTCAAA TTAAAGTTGC GCAAGGTGCA 6780 

AAGCCTGGTG AAGGTGGTCA ATTACCTGGT ACTAAGGTAT ATCCGTGGAT TGCGAAGACA 6840 

AGAGGGTCAA CGCCAGGTAT CGGTCTGATT TCACCACCGC CACATCATGA TATTTATTCA 6900

ATAGAAGATT TAGCGCAACT GATACATGAT TTGAAAAATO OGAATAAAGA TGCAGATATC 6960 

GCGGTAAAAT TAGTTTCGAA AACAGGTGTT GGTACCATTG CATCTGGGGT GGCAAAAGCA 7020 

TTTGCAGATA AAATTGTCAT CAGTGGTTAC GATGGTGGTA CAGGGGCTTC ACCCAAAACG 7080

AGTATTCAGC ATGCCGGTGT TCCTTGGGAG ATTGGTTTAG CAGAAACACA TCAAACATTA 7140 
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AAAGATGTAG CGTACGCATG TGCGCTTGGA GCGGAAGAAT TTGGATTTGC AACTGCACCA 7260

TTAGTGGTGT TGGGCTGTAT TATGATGCGT GTATGCCATA AAGATACATG TCCAGTAGGA 7320

GTTGCAACTC AAAACAAAGA TTTACGTGCT TTATATAGAG GTAAAGCACA TCATGTTGTT 7380

AATTTTATGC ATTTTATTGC ACAAGAATTA AGAGAAATTT TAGCATCTTT AGGTTTGAAA 7440

CGTGTAGAAG ACTTAGTTGG AAGAACTGAT TTATTACAAC GATCATCAAC ATTAAAAGCG 7500

AATAGCAAAG CGGCTAGTAT TGATGTTGAA AAACTGTTAT GTCCTTTCGA TGGGCCAAAC 7560

ACAAAAGAAA TTCAACAAAA TCATAATCTT GAGCATGGAT TTGATTTAAC AAATTTATAT 7620

GAAGTAACGA AGCCATATAT TGCTGAAGGG CGTOGCTATA CAGGTAGCTT TACAGTAAAT 7680

AATGAACAAC GTGATGTAGG GGTTATTACA GGTAGTGAGA TTTCGAAACA ATATGGAGAA 7740

GCAGGACTTC CTGAAAATAC AATTAATGTT TATACGAATG GTCATGCTGG TCAAAGTCTT 7800

GCAGCATATG CACCX5AAAGG CTTAATGATT CATCATACTG GAGATGCGAA TGACTATGTT 7860

GGTAAAGGAT TATCTGGTGG TACGGTCATT GTCAAAGCAC CTTTTGAAGA ACGACAAAAT 7920

GAAATTATTG CTGGTAAOGT CTCATTCTAT GGTGCGACAA GTGGTAAGGC ATTTATTAAC 7980

GGTAGTGCAG GAGAAAGATT CTGTATTAGA AATAGTGGTG TAGATGTTGT CGTTGAAGGT 8040

ATCGGCGACC ATGGATTAGA GTATATGACT GGTGGACATG TCATTAATTT AGGTGATGTA 8100

GGTAAGAACT TCGGTCAAGG TATGAGTGGT GGTATTCCTT ACGTTATCCC GTCTGATGTA 8160

GAAGCTTTTG TTGAAAATAA TCAACTAGAT ACGCTTTCGT TTACAAAGAT TAAACACCAA 8220

GAAGAAAAAG CATTCATTAA GCAAATGCTG GAAGAACATQ TGTCACACAC GAATAGTACG 8280

AGAGCGATTC ATGTGTTAAA ACATTTTGAT CGCATTGAAG ATGTCGTCGT TAAAGTTATT 8340

CCTAAAGA1T ATCAATTAAT GATGCAAAAA ATTCATTTGC ACAAATCATT ACATGACAAT 8400

GAAGATGAAG CGATGTTAGC TGCATTTTAC GATGACAGTA AAACAATCGA TGCTAAACAT 8460

AAACCAGCCG TTGTGTATTA AGGAAAGGGG GAGATACGAT GGGTGAATTT AAAGGATTTA 8520

TGAAGTATGA CAAACAGTAC TTAGGTGAAT TATCACTGGT AGACCGTTTG AAGCATCATA 8580

AAGCATATCA ACAACGATTT ACTAAAGAAG ATGCCTCTAT CCAAGGTOCA CGATGTATGG 8640

ATTGTGGAAC GCCGTTTTGT CAAACCGGAC AACAGTATGG TAGOGAAACA ATAGGTTGTC 8700

CAATTGGAAA CTACATTCCT GAATGGAACG ACTTAGTGTA TCATCAAGAT TTTAAAACTG 8760

CTTATGAACG CTTAAGCGAA ACAAATAACT TTCCTGACTT TACAGGGCGT GTATGTCCTG 8820

CACCATGCGA AAGTGCTTGT GTGATGAAGA TTAATAGAGA ATGGATTGCG ATTAAAGGTA 8880

TTGAACGCAC AATTATTGAT GAAGCTTTTG AAAATGGTTG GGTAGCGCCG AAAGTTCCGA 8940



EP0 786 519 A2 



CTGAAGAACT TAATCTACTA GGATATCAAG TAACTATTTA TGAACGTGCT AGAGAATCAG 9060 

GCGGTTTATT AATGTATGGT ATTCCGAATA TGAAACTTGA TAAAGATGTG GTTCGACGTC 9120 

GTATTAAGTT AATGGAAGAA GCGGGCATTA CTTTCATTAA TGGTGTTGAA GTCGGTGTTG 9180

ATATTGATAA AGCAACGTTA GAATCTGAGT ATGATGCCAT TATATTATGT ACTGGTGCAC 9240 

AAAAAGGTAG AGATTTACCT TTAGAAGGAC GCATGGGTGA TGGTATACAT TTCGCTATGG 9300 

ATTATTTAAC TGAACAAACG CAGTTGTTAA ATGGAGAAAT TGATGATATA ACAATAACTG 9360 

CAAAAGATAA GAATGTCATT ATCATTGGTG CTGGTGATAC AGGGGCAGAC TGTGTAGCGA 9420 

CAGCATTAAG AGAAAATTGT AAATCGATTG TTCAATTTAA TAAATATACG AAATTGCCAG 9480 

AAGCAATTAC ATTTACAGAA AATGCATCAT GGCCTTTAGC AATGCC5GGTG TTTAAAATGG 9540 

ACTATGCGCA CCAAGA6TAC GAAGCTAAGT TTGGTAAGGA ACCACGTGCA TATGGTGTTC 9600 

AAACAATGGG TTACGATGTT GACGATAAAG GACACATACG TGGTTTGTAT ACTCAAATTT 9660 

TAGAGCAAGG CGAAAATGGT ATGGTCATGA AAGAAGGACC TGAAAGATTT TGGCCTGCTG 9720 

ACCTTGTATT ATTATCAATC GGCTTCGAAG GTACAGAACC AACAGTACCQ AATGCTTTTA 9780 

ACATTAAAAC GGATAGAAAT CGAATCGTGG CGGATGATAC AAACTATCAA ACTAATAATG 9840

AAAAGGTATT TGCTGCTGGA GATGCTAGAC GTGGTCAAAG TTTAGTTGTA TGGGCAATTA 9900 

AAGAAGGTAG AGGC6TAGCG AAAGCAGTAG ATCAGTATTT AGCTAGTAAA GTTTGTGTAT 9960 

AATCTTTGTA TGGAAATGGT GGTTACGTTG ACGTTGTGAC ATGCTCAATC GAGTTTGAAA 10020

AAATCTAGTA TCTATCAACG TCACATGCCA TCTTTGTAAC CTAAAAACAA AGGTTTGTAA 10080

GACAACAAAT AGATTAATTA TAAGTAGTGA TTTTTTACAT TCGTTTATAG GTCAACTGTA 10140 

GTGGAAGACA ATGATTTGTG GTAATCATGT AATGCTTAAA AACAATATTG ACTTTTACAG 10200 

AACOTTCATA TATGATAAAT ATTGTGTTTA GGAGGAATAC CCAAGTCCGG CTGAAGGGAT 10260 

CGGTCTTGAA AACCGACAGG GGCTTAACGG CTCGCGGGGG TTCGAATCCC TCTTCCTCCG 10320 

CCATCT^TAT TTATATTAAA TTCTATATAT AATGAAGGTA AGTGCTCAAA TTTTGAGTAT 10380 

TTACCTTTTT TATTTGTCTT TGAATGGCTC GTAATTTTTG ATAATAGAAA TGATAAGGCA 10440 

TTGAGATTGG AAGGGCATTT GGCTTGTGCA ATATACATAG CTAAATGTCT TTTTTGTTTT 10500 

GTGAAATATG ATGGATGGCT TGTGTGGACA AGTTTGCTAT TTATAGATAT GCATTTTTCA 10560 

ATTTAGGAGT TGGCCATGCA TCTACACTTT ATAATGGTGA GAGCGTGGTG AGGTATTGTT 10620 

AATAACGCAA TTGTAGCGAG GAGTTATTGC TACATATGTC GTTATGGCTC ATTGATTTTC 10680 

TGAAATGGCT ACCCCAGATA ATTGTGACAA AATAAAAATA TTTTGTTGAA AGCCTTTACA 10740 
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TAAAAAGAGA AGATGTAAAA GCCATCGTAA CCGCTATTGG GGGAAAAGAA AATCTTGAAG 10860 

CTGCAACGCA TTGTGTAACA CGATTACGTT TAGTGCTGAA GGATGAAAGT AAAGTTGATA 10920 

AAGACGCATT AAGTAATAAC GCGTTGGTCA AGGGGCAGTT TAAAGCAGAC CATCAATATC 10980

AAATTGTCAT TGGTCCAGGA ACAGTCGATG AAGTGTATAA GCAGTTTATT GATGAAACAG 11040 

GTGCTCAAGA AGCTTCGAAA GATGAAGCGA AACAAGCAGC TGCACAAAAA GGGAATCCAG 11100 

TACAACGTTT GATCAAATTG TtGGGGGATA TTTTTATACC AATATTACCT GCGATTGTGA 11160 

CAGCTGGTTT GTTAATGGGA ATCAATAATT TACTTACAAT GAAAGGTTTA TTTGGTCCAA 11220 

AAGCACTTAT TGAGATGTAT CCACAAATTG CTGATATTTC AAACATCATT AATGTGATTG 11280 

CGAGTACGGC ATTTATTTTC TTACCAGCAT TAATTGGTTG GAGTAGTATG CGTGTATTTG 11340 

GTGGTAGTCC GATTCTAGGC ATAGTCTTAG GTTTGATTTT AATGCATCCG CAATTAGTAT 11400 

CTCAGTATGA TTTGGCAAAA GGGAATATTC CGACGTGGAA CTTATTTGGC TTAGAGATTA 11460 

AGCAGTTGAA TTACCAAGGT CAAGTGTTGC CAGTtTTAAT TGCAGCTTAC GTTCTAGCTA 11520 

AAATTGAAAA AGGATTAAAT AAAGTCGTTC ACGATTCGAT AAAAATGTTG GTCGTTGGAC 11580 

CCGTAGCGCT TTTAGTTACT GGATTTTTAG CATTTATTAT CATTGGACCA GTTGCGTTAT 11640 

TGaTTGGTAC AGGTATTACA TCTGGTGTTA CATTTATATT CCAACATGCA GGATGGCTTG 11700 

GCGGAGCAAT ATATGGATTG TTATATGCAC CACTT6TAAT TACAGGACTA CACCATATGT 11760 

TTTTAGCAGT AGATTTCCAA TTGATGGGTA GCAGCTTAGG CGGTACGTAT TTATGGCCAA 11820

TTGTTGCGAT TTCCAATATT TGTCAGGGCT CTGCAGCATT TGGAGCATGG TTTGTCTATA 11880 

AACGTCGTAA AATGGTTAAA GAAGAAGGCT TGGCATTAAC ATCTTGTATT TCTGGTATGT 11940 

TAGGTGTTAC TGAACCAGCC ATGTTCGGTG TGAACTTACC TCTGAAATAT CCATTTATCG 12000

CTGQGATATC AACX3TCTTGT GTATTCGGGG CAATCGTTGG TATGAATAAC GTACTTGGAA 12060 

AAGTTGGTGT TGGTGGCGTG CCAGCATTCA TTTCAATTCA AAAAGAATTT TGGCCAGTAT 12120 

ATCTTATTGT GACAGCTATT GCTATTGTTG TACCATGTAT ACTAACAATT GTGATGTCTC 12180 

ATTTTAGTAA ACAAAAAGCG AAAGAAATTG TTGAAGATTA ATAAAATAAA AAAGGGGCGT 12240 

TCGTTATTTG GACGTCCTTT ATTACX3TTAT AAGGTGGTAA TTGTGTGTCG AAAGAAATAG 12300 

ATTGGAGAAA ATCCGTTGTA TATCAAATTT ATCCTAAGTC GTTTAATGAT ACGACGGGGA 12360 

ATGGTATAGG AGATATCAAT GGAATTATAG AAAAATTGGA TTATATCAAG TTATTGGGTG 12420 

TTGATTATAT TTGGTTAACA CCAGTGTATG AATCACCGAT GAATGATAAT GGCTATGATA 12480 

TCAGCAATTA TTTAGAAATC aATGAAGACT TTGGAACGAT GGATGATTTT GaAAAGTTAA 12540 
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CGACGGAGCA TGaATGGTTT AiU^GAAGCCC GTAAATCTAA AGATAACCCy TATAGAGATT 12660 

ATTACTTTTT CAGATCATCT GAAGACGGGC CGCCAACAAA TTGGCATTCT AAATTCGGTG 12720 

GTAATGCATG GAAGTATGAT TCTGAGACAG ATGAATATTA TTTACATTTA TTTGATGTCA 12780 

GTCAAGCTGA TTTAAATTGG GATAATCCGG AAGTACX3TCA ATCGTTATAT CGCATAGTCA 12840 

ATCATTGGAT AGACTTCGGC GTTGATGGTT TTCGATTTGA TGTCATTAAC TTAATTTCTA 12900 

AAGGTGAATT TAAGGACTCT GACAAAATAG GTAAAGAATT TTATACGGAT GGTCCTAGAG 12960 

TGCATGAGTT TCTGCATGAA TTAAATCGTC AAACGTTTGG TAACACTGAC ATGATGACTA 13020

TAGGAGAAAT GTCTTCGACG ACGATTGAAA ATTGTATTAA GTATACACAA CCAGAACGCC 13080 

AAGAATTGAA TAGTGTTTTT AATTTTCATC ATCTAAAGGT TQATTATGTT GATGGTGAAA 13140 

AGTGGACAAA TGCGAgcTTG nATTTTCATA AGTTAAAGGA AATTCTGATG CAATGGCAAC 13200 

GAGGTATTTA TGACGGTGGC GGATGGAACG CGATTTTCTG GTGTAATCAT GATCAGCCAC 13260 

GGGTAGTGTC TAGATTTGGT GATGATACX3T CGGAAGAGAT GAGGATACAA AGTGCTAAAA 13320 

TGTTAGCTAT CGCACTGCAT ATGTTGCAAG GGACGCCATA TATTTACCAA GGTGAAGAAA 13380 

TTGGTATGAC GGACCCACAT TTTACATCAA TAGCACAATA TCGTGATGTT GAATCGATTA 13440 

ATGCCTACCA TCAGTTGTTA AGTGAAGGGC ATGCTGAAGC GGATGTGTTA GCGATTTTAG 13500 

GACAGAAGTC ACGAGACAAT TCGAGAACGC CTATGCAATG GAGTGATGAT GTTAATGCTG 13560 

GATTTACAGC TGGTAAnCCT TGGATTGATA TTTCGGAAAA TTATCATCAG GTCAACGTTA 13620 

GACAAGCACT TCAGAATAAA GAGTCTATTT TCTATACGTA TCAAAAATTA ATACAATTAA 13680 

GACATACGCA TGATATTATT ACGTATGGAG ACATTGTGCC ACGTTTTATG GATCATGATC 13740 

ATTTATTT6T TTATGAACGT CATTATAAGA ATCAACAATO OCTAGTAATT GCGAATTTCT 13800 

CAGCaTCGGC TGTTGATTTG CCAGAAGGAT TGGCTAGAGA AGGTTGTGTT GTGATTCAAA 13860 

CAGGCACAGT GGAAAATAAT ACGATAAGCG GGTTTGGTGC AATTGTAATC GAAACAAACG 13920

CGTAAAATAA ATTGAGTGGA TGCGTTTATA TGGCGAAACA AAAAAAGTTT ATGAAGATTT 13980

ATGAGGCGTT GAAAGAAGAT ATATTAAACG GGCAGATTCA ATATGGTGAA CAAATTCCGT 14040 

CTGAACATGA TTTGGTGCAA TTGTACCAGT CATCTCGAGA GACCGTGCGT AAGGCATTAG 14100 

ATTTGTTGGC ATTAGACGGC ATGATTCAAA AGATTCATGG TAAAGGGTCA CTTGTCATTT 14160 

ATCAGGAGGT TACAGAGTTT CGATTTTCTG AACTTGTTAG TTTTAAAGAA ATGCAAGAAG 14220 

AAATOGGCGT CGCATATTTA ACTGAAGTTG TTGTGAATGA GGTTGTTGAA GCGCATGAAG 14280 

TTCCAGAAGT TCAACATGCT TTAAACATCA ATTCTAGTGA ATCACTCATT CATATTGTTA 14340 
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TTGTTTCAGA TATAGGTAAT GATGTTGCX3A 
TATTAAATCT TAATATTAGT TATTCAAGTA 
AAGCATATCA ATTGTTTGGT GATGTATCGG

TGTATTTAGA AAATACAATG CCGTTTCAAT 
TTAAATTTAA TGACTTCTCA AGACGTCGTA 
CTTGCAATTA ACTATTAAAA TATAGTAATA 
CGGTTCCCTG TACTCGAAAT CCGCTTTATG 
TTTTGCGAAG TCTGCCCAAA GCACGTAGTG 


CCCATGAACC ATGTCAGGTC CTGACGGAAG 
AGGgTAGCCG AGATTTAGCT AACGACTTTO 
GGTGCACGGT TTTTTATTTT TTAAATATTA 


TTATAGAAGC TACTTTCTT6 AAGACAATTC 
AAGTAGCTTT TTTATATGTG PAGTTTGpLrT 
TTTTGTGTCA ATGAAAAGTA AGAAGTTATA 


AGGGGGAGTA TCTTACAATA GAATTATTAA 
TGCCTACGGA GGACATATGC AAATATATTT 
ATCTTTAAAT AGTATTGAAG AAAGTTTTGA 
TGCGAAA6TA AAACATTTAA GAAAATCTCC 
GAAAAATGAA AATAACGATG TCGTTGGACA 
TGATGATAAG ACGTATTATG GTTTGGCGAT
TGGACAAAAA TTAGGTCGTG GCTTGGTTCA 
GTATAGTAOG GTTGTTGTAG ACCATTGTTT 
TGCTGCTGAG CATGACATTA AATTAGAATC 
ATGGGATAAT TTGACGGATG CACCACACGG 
ATTGTTCAAT TAAGAAGTAA AGGTATTATC 


GGTGCTAACT TGAATTATCA AGCCTTATAT 
GTCGTCGGAC AAGAACATGT CACGAAGACA 
TCGCATGCTT ATATTTTTAG TGGTCCGAGA 


TTTGCTAAAG CAATCAACTG TCTAAATAGC 
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GTGATTCTAT TTATGATTAT TTGGAAAAGG 14460 

AGTCTATTAC TTTTGAACCG TTTGATGAAC 14520 

T6GCTTATTC AGCAACAGTT CGAAGTATTG 14580 

ATAATATTTC AAAACATCTT GCAAATGAAT 14640 

TAAAGTAAAC AATGATATAA ATGATTTATA 14700 

TATATCTTGC CGTGCTAGGT GGGGAGGTAG 14760 

CGAGGCTTAA TTCCTTTGTT GAGGCCGTAT 14820 

TTTGAAGATT TCGGTCCTAT GCAATATGAA 14880 

CAGCATTAAG TGGATCATCA TATGTGCCGT 14940 

GTTACGTTCG TGAATTACGT TCGATGCTTA 15000 

AACCGATTAT TAAGAGTTGA AAATATATAA 15060 

AGCGTATTAT ACGTGGAACA TGTTTGTGGG 15120 

CAAGTGAACT CGATGTGCAG TTTGAATGAT 15180 

ATTTGATGAT AAAGAAATGA TGGT6AAATG 15240 

TGAGATACX3T TATGATTATT GACAATCAAA 15300 

AAGTACTTTA ACAGAGTTAG ATTATGATAA 15360 

TGATAATCCT GAAACGAGTT GGCAAGCACG 15420 

TTGCTATAAT TTTGAATTAG AAGTAATAGC 15480 

OGTTTTATTA ATTGAAGTAG AAATTAATAG 15540 

TGCCTCTTTA TCAGTTCATC CTGAATTACG 15600 

AGCAGTAGAA GAGCGTGCCA AAGCACAAGA 15660 

TGACTACTTT GAAAAGTTGG GTTATCAAAA 15720 

TGGTGATGCA CCGTTACTTG TAAAATATTT 15780

AATCGTAAAA TTTCCAGAAC ATTTTTATTA 15840 

ATGCTATAAT GAGAGGTAAT TGTTTATGGA 15900 

CGTATGTACA GACCCCAAAG TTTCX3AGGAT 15960 

TTGCGCAATG CGATTTCGAA AGAAAAACAG 16020 

GGTACGGGGA AAACGAGTAT TGCCAAAGTG 16080 

ACTGATGGAG AACCTTGTAA TGAATGTCAT 16140 
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AATAATGGCG TTGATGAAAT AAGAAATATT AGAGACAAAG TTAAATATGC ACCAAGTGAA 1S260 

TCGAAATATA AAGTTTATAT TATAGATGAG GTGCACATGC TAACAACAGG TGCTTTTAAT 16320 

GCCCTTTTAA AGACGTTAGA AGAACCTCCA GCACACGCTA TTTTTATATT GGCAACGACA 16380

GAACCACATA AAATCCCTCC AACAATCATT TCTAGGGCAC AACGTTTTGA TTTTAAAGCA 16440

ATTAGCCTAG ATCAAATTGT TGAACGTTTA AAATTTGTAG CAGATGCACA ACAAATTGAA 16500 

TGTGAAGATO AAGCCTTGGC ATTTAtcgCT AAAGCGTCTG AAGGGGGTAT GCGTGATGCA 16560 

TTAAGTATTA TGGATCAGGC TATTGCATTT GGTGATGGTA CGTTAACATT GCAAGATGCG 16620 

TTGAATGTCA CAGGTAGCGT ACATGATGAA GCGTTGGATC ACTTGTTTGA TGATATTGTA 16680 

CAAGGTGACX5 TACAAGCATC TTITAAAAAA TACCATCAGT TTATAACAGA AGGTAAAGAA 16740 

GTGAATCGCC TAATAAATGa TATGATTTAT TTTGTCaGAG ATACGATTAT GAATAAAACA 16800 

TCTGAGAAAG ATACTGAGTA TCGAGCACTG ATGAACTTAG AATTAGATAT GTTATATCAA 16860 

ATGATTGATC TTATTAATGA TACATTAGTG TCXSATTCGTT TTAGTGT6AA TCAAAACGTT 16920 

CATTTTGAAG TGTTGTTAGT AAAATTAGCT GAGCAGATTA AGGGTCAACC ACAAGT6ATT 16980 

GCGAATGTAG CTGAACCAGC ACAAATTGCT TCATCGCCAA ACACAGATGT ATTGTTGCAA 17040 

CGTATGGAAC AGTTAGAGCA AGAACTAAAA ACACTAAAAG CACAAGGAGT GAGTGTCGCT 17100 

CCTGTTCAAA AATCTTCGAA AAAGCCTGCG AGAGGCATAC AAAAATCTAA AAATGCATTT 17160 

TCAATGCAAC AAATTGCAAA A6TGCTAGAT AAAGCGAATA AGGCAGATAT CAAATTGTTG 17220 

AAAGATCATT GGCAAGAAGT GATTGATCAT GCCAAAAATA ATGATAAAAA ATCACTCGTT 17280 

AGTITATTGC AAAATTCGGA ACCTGTGGOG GCAAGTGAAG ATCACGTACT TGTGAAATTT 17340 

GAGGAAGAGA TCCATTGTGA AATCGTCAAT AAAGACGACX3 AGAAACGTAG TAGTATAGAA 17400 

AGTCTTGTAT GTAATATCGT TAATAAAAAC GTTAAAGTTG TTGGTGTACC ATCAGATCAA 17460 

TGGCAAAGAG TTCGAACGGA ATATTTACAA AATCGTAAAA ACGAAGGCGA TGATATGCCA 17520 

AAGCAACAAG CACAACAAAC AGATATTGCT CAAAAAGCAA AAGATCTTTT CGGTGAAGAA 17580 

ACTGTACATG TGATAGATGA AGAGTGATAC ATGACAAGCG ATATAATCGT ATGTATAATG 17640 

AAAGAAACAT CATTTTATTG ATAAATATTT ATTGATTTTC AAGGAGGAAA TGGAATATGC 17700 

GCGGTGGCGG AAACATGCAA CAAATGATGA AACAAATGCA AAAAATGCAA AAGAAAATGG 17760 

CTCAAGAACA AGAAAAACTT AAAGAAGAGC GTATTGTAGG AACAGCTGGC GGTGGCATGG 17820 

TTGCAGTTAC TGTAACTGGT CATAAAGAAG TTGTCGACGT TGAAATCAAA GAAGAAGCTG 17880 

TAGACCCAGA CGATATTGAA ATGCTACAAG ACTTAGTGTT AGCAGCTACT AATGAAG06A 17940 






TCCCTGGaAT GTGATCATAG ATGCATTATC 
TTATGAAATT GCCAGGCATT GGTCCAAAGA 
ATATGAAAGA AGACGATGTT GTTCAGTTTG

TAACATATTG TAGCGTATGT GGTCACATTA 
ATAAGCAAAG AGATCGTTCA GTTATTTGTG 
TGGAAAAAAT GAGAGAATAC AAAGGTTTAT 
TGGATGGCAT TGGACCAGAA GATATTAATA 
ATGAAGTTAG CGAATTAATC TTAGCTATGA 


TGTATATTTC TAGATTAGTT AAGCCTATAG 
TATCXK3TAGG TGGCGATTTA GAGTATGCTG 
GTAGAACAGA AATGTAATkT CTTCTATTAA 


AAGTCACAGT GTAATCATTG TGGCTTTTTT 
GCGGTGTGGC GGTGGTATGG TTTACCTAGT 
CAAGCCGTTG GTTGTGATTT GTTACTTCTA 


TAGATCTATG GTTATGGTGT GTTGGTGCTA 
CAAATGAAAT TCTTTTGTAA TTGAAAT6AT 
GGTCTAAAGC TTATTAAATC AGCCTGTATA 
TAAATTTATT TTTAATTTCT GGTAAAAAAA 
ATATGGTTAG AGAAAAATCT GTTTCTTGTT 

TTTTTAAGTT CGATTTTTAG GATAAGGGCG
ACTGXTGTTA AGCAGTTTGA AAGCCTGTAT 
CTCAACTTAA GAAATAACTT GAATTACTAA 

AAATGTTAAT AAAATGTATA ATTAATTCTT
AATGACAATA TGTCAACX;TT AATTCCAAAA 
GTATTTATGA GCTAATCAAA CATCATAATT 
GAACGCTGGC GGCGTGCCTA ATACATGCAA 
CTGATGTTAG CGGCGGACGG GTGAGTAACA 
ACTTCGGGAA ACCGkAGCTA ATACCGGATA 


AGACGGTCTT GCTGTCACTT ATAGATGGAT 
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CAGAACCTAT ATCAAAACTT ATTGATAGCT 18060 

CAGCCCAACG TCTGGCTTTT CATACCTTAG 18120 

CCAAAGCATT AGTAGATGTT AAGAGAGAAT 18180 

CTGAAAATGA TCCATGTTAT ATTTGTGAAG 18240 

TTGTGGAAGA TGACAAAGAT GTCATAGCTA 18300 

ATCACGTTTT ACATGGGTCT ATTTCGCCTA 18360 

TTCCTTCATT GATTGAACGC TTGAAAAACG 18420 

ACCCGAACTT AGAGGGGGAA TCTACAGCCA 18480 

GTATCAAAGT GACGAGATTA GCACAAGGGT 18540 

ACX5AAGTAAC ATTATCTAAA GCAATCGCAG 18600 

ACATTTTTGA TTTTAATACT ATAGTAAGAA 18660 

TATGGTGTGG TGTGATGTAC TACTTTATTT 18720 

TTTACTGAGG GATGGGTAAT CTTTAGGAAG 18780 

ATAGTAATGA TGTGAATTGG ATTATCGAAT 18840 

TTAATTTGAT AAATGCGGTT AATGACTATG 18900 

AGATGCTGGC TTAGTAAGTT GTACTTCTTT 18960 

GCGGTGTTTT GAGAGATTAT TTAAAACTTG 19020 

TAACGTTCTG TTTTGCGTTT TTTTTGATTG 19080 

CTAAAAAACG TACTATTTAT AAGTGGGGAT 19140 

TTCAGTACAG ATGACAAAGG TGTAATTTTT 19200 

AGTATTTATT TGTTGAGGCA AACAAAACAA 19260 

CGAAAATTAA TTTTAAAAAG TTATTGACTT 19320 

GTCGGTAAGA AAAATGAACA TTGAAAACTG 19380 

AACGTAACTA TAAGTTACAA ACATTATTTA 19440 

TTTATGGAGA GTTTGATCCT GGCTCAGGAT 19500 

GTCGAGCGAA CGGACGAGAA GCTTGCTTCT 19560 

CGTGGATAAC CTACCTATAA GACTGGGATA 19620 

ATATTTTGAA CCGCATGGTT CAAAAGTGAA 19680 

CCGCX3CTGCA TTAGCTAGTT GGTAAGGTAA 19740 
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GAGACACGGT CCAGACTCCT ACGGGAGGCA 
gCtGaCGGAG CAACGCCGCG TGAGTGATGA 
GGGAAGAACA TATGTGTAAG TAACTGTGCA

GGCTAACTAC GTGCCAGCAG CCGCGGTAAT 
TGGGCGTAAA GCGCGCGTAG GCGGTTTTTT 
GTGGAGGGTC ATTGGAAACT GGAAAACTTG 
GTAGCGGTGA AATGCGCAGA GATATGGAGG 
TGTAACTGAC GCTGATGTGC GAAAgCGTGG 


CCACGCCGTA AACGATGAGT GCTAAGTGTT 
AACGCATTAA GCACTCCGCC TGGGGAGTAC 
GGGGACCCGC ACAAGCGGTG GAGCATGTGG 


CAAATCTTGA CATCCTTTGA CAACTCTAGA 
GACAGGTGGT GCATGGTTGT CGTCAGCTCG 
CGAGCGCAAC CCTTAAGCTT AGTTGCCATC 


GTGACAAACC GGAGGAAGGT GGGGATGACG 
TACACACGTG CTACAATGGA CAATACAAAG 
CATAAAGTTG TTCTCAGTTC GGATTGTAGT 
CTAGTAATCG TAGATCAGCA TGCTACGGTG 
CGTCACACCA CGAGAGTTTG TAACACCCGA 

CGTCGAAGGT GGGACAAATG ATTGGGGTGA

GCGtgTTGGAT CACCTCCTTT CTAAGGATAT 
ATAACGTGAC ATATTGTATT O^GTTTTGAA 

TAAAGTGATA TTGCTTATGA AAATAAAGCA
TACATTGAAA ACTAGATAAG TAAGTAAAAT 
AAAGAGTTTT AAATAAGCTT GAATTCATAA 

CACAAGATTA ATAACGCGTT TAAATCTTTT

TGACTTATAA AAATGGTGGA AACATAGATT 
GGCACTAGAA GCCGATGAAG GACGTTACTA 


AGCTTTGATC CAGAGATTTC CGAATGGGGA 



GCAGTAGGGA ATCTTCCGCA ATGGGCGAAA 19860 

AGGTCTTCGG ATCGTAAAAC TCTGTTATTA 19920 

CATCTTGACG GTACCTAATC AGAAAGCCAC 19980 

ACGTAGGTGG CAAGCGTTAT CCGGAATTAT 20040 

AAGTCTGATG TGAAAGCCCA CGGCTCAACC 20100 

AGTGCAGAAG AGGAAAGTGG AATTCCATGT 20160 

AACACCAGTG GCGAAGGCGA CTTTCTGGTC 20220 

GGATCAAACA GGATTAGATA CCCTGGTAGT 20280 

AGGGGGTTTC CGCCCCTTAG TGCTGCAGCT 20340 

GACCGCAAGt TGAAACTCAA AGGAATTGAC 20400 

TTTAATTCGA AGCAACGCX»A AGAACCTTAC 20460 

GATAGAGCCT TCCCCTTCGG GGGACAAAGT 20520 

TGTCGTGAGA TGTTGGGTTA AGTCCCX5CAA 20580 

ATTAAGTTGG GCACTCTAAG TTGACTGCCG 20640 

TCAAATCATC ATGCCCCTTA TGATTTGGGC 20700 

GGCAGCGAAA CCGCGAGGTC AAGCAAATCC 20760 

CTGCAACTCG ACTACATGAA GCTGGAATCG 20820 

AATACGTrcC CGGGTCTTGT ACACACCGCC 20880 

AGCCGGTGGA GTAACCTTTT AGGAGCTAGC 20940 

AGTCGTAACA AGGTAGCCGT ATCGGAAGGT 21000 

ATTCGGAACA TCTTCTTCAG AAGATGCGGA 21060 

TGTTTATTTA ACATTCAAAT ATTTTTTGGT 21120 

GTATGCGAGC GCTTGACTAA AAAGAAATTG 21180 

ATAGATTTTA CCAAGCAAAA CCGAGTGAAT 21240 

GAAATAATCG CTAGTGTTCG AAAGAACACT 21300 

TATAAAAGAA CGTAACTTCA TGTTAACGTT 21360 

AAGTTATTAA GGGCGCACGG TGGATGCCTT 21420 

ACGACGATAT GCTTTGGGGA GCTGTAAGTA 21480 

AACCCAGCAT GAGTTATGTC ATGTTATCGA 21540 
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GAGGAAGAGA 
ACCAACAAGC 
TTAGACGAAT 
TGTCTCTCTT 
AGGACCATCT 
GAAAG6TGAA 
GTAGTCAGAG 
GATTTGATGC 
CGTTTAGTAT 
CAGGTAACAC 
GGGTAGCGGA 
GGGCTAGCCT 
CGGGTTACCX3 
TGGGTGATAA 
AThThTGTTA 
GCAGCCATCA 
GTACCGGGGC 
CGTTCTAAGG 
CCGGTGTGAG 
AGGAAGGCTC 
TGGA7AACAG 
tAGGATAGGC 
AAATCOGGTA 
TTGATTTCAC 
GACACAGGTA 
GGCAAAATGA 
GCCGCAGTGA 
AGGTGATGTA 
CTGCGAAgCT 



AAGAAAATTC 
TTGCTTGTTG 
CATCTGGAAA 
GAGTGGATCC 
CCTAAGGCTA 
AAGCACCCCG 
CCCGTTAATG 
AAGGTTAAGC 
TTGGTCGTAG 
TGAATGGAGG 
GAAATTCCAA 
CAAGTGATGA 
AATTCAGACA 
GGTCCGTGTT 
AGTG6AAAAG 
TTTAAAGAGT 
TAAACATATT 
GCX3TTGAAGC 
TAGCGAAAGA 
GTCCGCTCTG 
GTTGATATTC 
GAAgcGTGcG 
CTCGTTAAGG 
ACTGCCGAGA 
GTCAAGATGA 
CCCCGTAACT 
ATAGGCCCAA 
TagGGcTGAC 
ACGAATCGAA 



GATTCCCTTA 
GGGTTGTAGG 
GATGAATCAA 
TGAGTACGAC 
AATACTCTCT 
GAAGGGGAGT 
GGTGATGGCG 
AGTAAATGTG 
ACCCGAAACC 
ACCGAACCGA 
TCGAACCTGG 
TTATTGGAGG 
AACTCCGAAT 
CGAAAGGGAA 
GATGTGGCGT 
GCGTAATAGC 
ACCGAAGCTG 
ATGATCGTAA 
CGGGTGAGAA 
GGTTAGTCGG 
CTGTACCACC 
ATTGGATTGC 
CTGAGCTGTG 
AAAGCCTCTA 
GAATTCTAAG 
TCGGGAGAAG 
GCGACTGTTT 
GCCTGCCCGG 
GCCCCAGTAA 



GTAGCGGCGA 
ACACTCTATA 
AGAAGGTAAT 
GGAGCACGTG 
AGTGACCGAT 
GAAATAGAAC 
TGCCTTTTQT 
GAGCCGTAGC 
AGGTGATGTA 
CTTACGTTGA 
AGATAGCTGG 
TAGAGCACTG 
GCCAATTAAT 
ACAGCCCAGA 
TGCCCAGACA 
TCACTAGTCXr 
TGGATTGTCC 
GGACATGTOG 
TCCCGTCCAC 
GTCCTAAGCT 
TATAATCGTT 
ACGTCTAAGC 
ATGGGGAGAA 
GATAGAAAAT 
GTGAGCGAGC 
GGGTGCTCTT 
ATCAAAAACA 
TGCTGGAAGG 
ACGGCGGCCG 



GCX^AAACGGG 

CGGAGTTACA 

AATCCTGTAG 

AAATTCCGTC 

AGTGAACCAG 

CTGAAACCGT 

AGAATGAACC 

GAAAGCGAGT 

CCCTTGGTCA 

AAAGTGAGCG 

TTCTCTCCGA 

TTTGGACGAG 

TTAACTTGGG 

CCACCAGCTA 

ACTAGGATGT 

AGTGACACTG 

TTTGGaCAAT 

AGCGCTTAGA 

CGATTGACTA 

GAGGCCGACA 

TTAATCGATG 

AGTAAGGCTG 

GACATTGTGT 

AGGTGCCCGT 

GAACTCTCGT 

TAGGGTTAAC 

CAGGTCTCTG 

TTAAGAGGAG 

TAACTATAAC 



aagagcccaa 
aaggacgaca 
tcgaaaatgt 
ggaatctggg 
taccx;tgagg 
gtgcttacaa 

GGCGAGTTAC 
CTGAATAGGG 
GGTTGAAGTT 
GATGAACTGA 
AATAGCTTTA 
GGGCCCCTCT 
AGTCAGAACA 
AGGTCCCAAA 
TGGCTTAGAA 
CGCCGAAAAT 
GGtAGGAGAG 
ASTGAGAATG 
AGGTTTCCAG 
GCGTAGGCGA 
GGGGGACGCA 
AGTATTAGGC 
CTTCGAGTCG 
ACCGCAAACC 
TAAGGAACTC 
GCCCAGAAGA 
CTAAACCGTA 
TGGTTAGcTT 
GGTCCTAAGG 



21660 

21720 

21780 

21840 

21900 

21960 

22020 

22080 

22140 

22200 

22260 

22320 

22380 

22440 

22500 

22560 

22620 

22680 

22740 

22800 

22860 

22920 

22980 

23040 

23100 

23160 

23220 

23280 

23340 









TGTCTCAACG AGAGACTCGG TGAAATCATA 
AGGACGGAAA GACCCCGTGG AGCTTTACTG 
TACAGGATAG GTAGGAGCCT TTGAAACGTG

ATACTACCCT AGCTGTGTTG GCTTTCTAAC 
TCAGGCGGGC AGTTTGACTG GGGCGGTCXSC 


TTCCCTCAGA ATGGTTGGAA ATCATTCATA 
GAGACCTACA AGTCXSAGCAG GGTCGAAAGA 
AAGGGCCATC GCTCAACGGA TAAAAGCTAC 


AGTTCACATC GACGGGGAGG TTTGGCACCT 
AGTCGGTCCC AAGGGTTGGg CTGTTCGCCC 
CGTCGTGAGA CAGTTCGGTC CCTATCCGTC 


CTTAGTACGA GAGGACCX^GG ATGGACATAC 
ATAGCTGGGT AGCTATGTGT QGACGGGATA 
CCTCAAGATG AGATTTCCCA ACTTCGGTTA 
GGTTCGAGGT GGAAGCATGG TGACATOTGG 
AATCAAAATA AATGTTTTGC GAAGCAAAAT 

ATAAATTACA TTCATATGTC TGGTGACTAT

AACACAGAAG TTAAGCTCCT TAGCGTCGAT 
AACGTTGCCA GGCAAAAAAT GGATGCGATG 

TTTATGTCTA AAACGTCAAA ATAAAAAGCA

AAACCTTTGA ATCTGACGAA AOGAGAAAAG 
TAA($yGAGAG CCGAAGrAGA GGAAAGAAGC 
TAGCGASGAT GGTAGCCAAC TTACGTTCCG 
AATGTACACT TTCGATTGTC TAAGTATGTA 

AAATGATATC ATCGAAAACA AAATATTGTA 


AATTGAAAAT GATCTTACTG CTCTTTTATA 

TTATTATACA ATAGACAAGC TATTGCATAA 

CTTTATAATT AATGATTTTA TTAGAGCGTC 


ACCGCCAAAG CCTAATATAA ATTTAGGGGT 



GTACCTGTGA AGATGCAGGT TACCCGCGAC 23460 

TAGCCTGATA TTGAAATTCG GCACAGCTTG 23520 

AGCGCTAGCT TACGTGGAGG CGCTGGTGGG 23580 

CCGCACCACT TATCGTGGTG GGAGACAGTG 23640 

CTCCTAAAAG GTAACGGAGG CGCTCAAAGG 23700 

GAGTGTAAAG GCATAAGGGA GCTTGACTGC 23760 

CGGACTTAGT GATCCGGTGG TTCCGCATGG 23820 

CCCGGGGATA ACAGGCTTAT CTCCCCCAAG 23880 

CGATGTCGGC TCATCGCATC CTGGGGCTGT 23940 

ATTAAAGCGG TACGCGAGCT GGGTTCAGAA 24000 

GTGGGCGTAG GAAATTTGAG AGGAGCTGTC 24060 

CTCTGGTGTA CCAGTTGTCG TGCCAACGGC 24120 

AGTGCTGAAA GCATCTAAGC ATGAAGCCCC 24180 

TAAGATCCCT CAAAGATGAT GAGGTTAATA 24240 

AGCTGACGAA TACTAATCGA TCGAAGACTT 24300 

CACTTTTACT TACTATCTAG TTTTGAATGT 24360 

AGCAAG6AGG TCACACCTGT TCCCATGCCG 24420 

GGTAGTcGAA CTTACGTTCC GCTAGAGTAG 24480 

AGCCGCATTG AGACCGCAAG GTCTCTTTTT 24540 

AACACAAAGA AAAATGGCTT GGCGAAGTGA 24600 

ArCGCAACGA GTTTAGTAGA GCTAAATGAG 24660 

AAGC6ATTGT CACAAGTCAA GAAAGGTTCT 24720 

CTAGAGTAGA ACTGGAAATG ATAATTTAAT 24780 

CAACTTTAAT TTTGTGTTTA TATAAATTTA 24840

TAAATAGAGA AGAGCAGTAA GACGGTATCT 24900

TACTTTATTG AAATACAAAA AGGAAATTAA 24960

GTAACACTAA CTTTTATCAA AGAAGTGTTA 25020 

TACATGCGGT TTTAAAGCAT CATCGTCTAT 25080 

TTTCTTATAG TCTTGATCAT CATCAAAATT 25140 
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TCCATTTTTT ACTGTAATTG TAAAATGCAT ACCCGTTTCA GCACCTTGAA TATCAAGCTG 25260 

CTCTTTGTAA GGTTTCAATC TTTTTAAAAT ATAGGTTAGT TTTCTACGAT AAATTCGTCT 25320 

CATTTTATTT AAATGCCTTT CAAAACCACC GGAAGATATA AACGTTGCAA TAAGGTTTTG 25380

CATATGAACA GGTACAGTGT TGCCTTCAAT GTGATTTTGA GAATGATATT TTTTCATTAT 25440 

AGAATAGGGT AACACCATAT ATGCAACTCG ACAGCTAGGA AAAATAGACT TTGAAAATGT 25500 

ACTGATATAA ATCACTTTTT CTCCTCTTGA ATATAGACCT TGAATTGCTG GAATGGGTTT 25560 

GCCGAAATAT CTAAACTCGG AATCATAATC ATCTTCTATA ATAAATCGTT CTTCTTTTTC 25620 

TTGAGCCCAT TGTATTAATT GAGTTCGTTT TTTTAAGTCC ATCACATATC CAGTTGGAAA 25680 


TTGATGGGAA GGCGTTATAT ATACTATATT TTTTTGTGAT TTAATAACTT CATCTACGTT 25740 

TATTCCATTA TCTTCAACTT CAATTTGTTC ATATTCAACT TGTTTTTTAT CTAAAATATT 25800 

TTTGATTGGT GGATAACTAG GTTTTTCGAT AATAAATGTT GAAGTATAAA GTAAATCGAC 25860 


TAATTGATTT ACTAATTGTT CGGTAGATGA GCCAATTATA ATTTGATTAG GATCACAAAT 25920 

TACGCCACGA TTAGTAAATA AATAAAATGC CAGTTGAAAC CGCAAATGTA ATTCTCCTTG 25980 

AAAATGTCCT CTACGTAATT GATTTAAATG ATTTGTATCA TAAAGATCTT TGGAATACTT 26040 


TCTGAAAAGT TCTATAGGGA AATGTTTCGT ATCTATTTCA TCCAAATTAA AAGCATAATC 26100 

ATAAGCTTCA TCACTCGCTT TTGGTTTATA TGAATCATCA TCAAAAAGAG AGGGGATAGG 26160 

TTGATTGTTT AAAATTGTTA AAGATTCAAT TTCXSGACACA AAATATCCAG AGCGAGGTCT 26220

TGAATAAATG TAACCTTCGT CTAATAGAAG TTGATATQCA T0CTCTACXK3 TTGTTTGGCT 26280 

AATAGATAAA TGTTTGCTTA ATTGTCTTTT AGAATAAAAT TTATCGCCTT CTTTAAATTG 26340 

ACCTTCAATT ATTTGTTTTT TTAATTTTTC ATAAAGTTGA TGGTATAAAG TGTTTTTCAA 26400

TTTXATAACT GACCTCCTAA ATTTATCTTA TTTTGTACCT TTTTAAATAT CAGTTTATAC 26460 

ATTACAATGT ATTTAATCAA CTTGAAAAGG GGTTTTATGT ATAATGAGTA AAATTATTGG 26520 

ATCAGACAGA GTCAAAAGAG GTATGGCTGA AATGCAAAAA GGCGGCGTTA TTATGGATGT 26580

CGTTAATGCT GAGCAAGCAA GAATTGCAGA AGAAGCTGGC GCGGTAgCAG TTATGGCATT 26640 

AGAACGAGTA CCTTCTGATA TTAGAGCTGC TGGTGGTGTT GCACGTATGG CAAACCCTAA 26700 

AATTGTAGAA GAAGTAATGA ATGCTGTTTC TATTCCAGTC ATGGCTAAAG CACGTATTGG 26760

TCATATCACT GAAGCAAGAG TATTAGAGGC GATGGGTGTT GACTATATTG ATGAATCAGA 26820 

AGTGTTAACA CCAGCAGATG AGGAATATCA CTTAAGAAAA GATCAATTTA CAGTACCATT 26880 


TGTATGTGGA TGTCGTAATT TAGGTGAAgm TGCGCGTAGA ATTGGTGAAG GTGCTGCTAT 26940 
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ACAAGTTAAT TCAGAAGTTA GTCGATTGAC 
TGCGAAAGAT ATCGGTGCGC CTTATGAAAT 


ACCGGTAGTT AACTTTGCAG CTGGTGGCGT 
GGAATTAGGT GCTGACGGTG TATTCGTTGG 
AAAATTTGCT AAAGCAATTG TTCAAGCAAC 


AAGATTAGCA AGTGAACTTG GCACTGCTAT 
AGAAGAACGT ATGCAAGAGC GTGGTTGGTA 
AGGTGCAGTA CGTGAACATA TTAGACATAT 
TAAAAAAGTT GAACAATTAG AAGAAATCX^A 
AACGTTACGT CGATTAATGA ATTTATATGG 
ACCTATGTTT GGTACATGCG CAGGATTAAT
AGGATACCTT AACAAGTTGA ATATTACT6T 
CAGCTTTGAA ACAGAATTAG ATATTAAAGG 
AAGAGCCCCA CATATTGAAA AAGTAGGTCA 
GAAAATTGTA GCTGTTCAGC AAGGTAAATA 
AGATGACTAT AGAGTAACTG ATTACTTTAT 


TGTATGCTAA ATCAACGAAT TATTGATATT 
TCAAACTTAG CTTTGGAGGA GTTATTTTTT 
GCTATACATA AGAAAAAAAC CCTTCAAAGA 


TAATTCGATG TTGATGTATT TGTTAAATAA 
ATACTAGTGT tGCACCGAAT AATAATTTCA 

TGTCATTAAG TGATTTAATC GCACCTGAAA
ATACTAAGAA TACAGATGTA ACACCTTTTG 
GTGCTTGCAT TGCTACAAAT TCGTTAGATA 

GAACTGCATC TTGCCATGGC ACACCGACTA
TTAATGTTTG GAAATCCCAA GAAATAGCGC 
CAATTCCATT TAATAGAGCG ATAATGGCAA 


CAGCTACTTT AAATCCATCT AAAATATATT 
TTTCTTCAGT TTCTTCAACT AATAATTTGT 




TGTAATGAAT GATGATGAGA TTATGACTTT 27060 

TTTAAAACAA ATTAAAGACA ATGGTCX3TTT 27120 

TGCGACTCCT CAAGATGCTG CTTTAATGAT 27180 

ATCAGGTATT TTTAAATCAG AAGATCCAGA 27240 

AACACATTAC CAAGACTATG AACTAATTGG 27300 

GAAAGGTTTA 6ATATCAATC AATTATCATT 27360 

AGATATGAAA ATAGGTGTAT TAGCATTACA 27420 

TGAATTAAGT GGTCATGAAG GTATTGGAGT 27480 

GGGCTTAATA TTACCTGGTG GCGAGTCTAC 27540 

ATTTAAAGAG GCTTTACAAA ATTCAACTTT 27600 

AGTTCTAGCG CAAGATATAG TTGGTGAAGA 27660 

ACAACGAAAC TCATTCX3GTA GACAAGTTGA 27720 

TATCGCTACA GATATTGAAG GTGTCTTTAT 27780 

AGGCGTAGAT ATCCTATGTA AGGTTAATGA 27840 

TTTAGGCGTA TCATTCCATC CTGAATTAAC 27900 

TAATCATATT GTAAAaAAAG CATAGCTTAA 27960 

TATAGATTTG TTGAGAAGAA AATATCTCCT 28020 

ATGTCAAAAT TAAAAATGAT AAAAAATAAA 28080 

GACTGAGAAT AGTCAAAATT TTGAAGGGGT 28140 

AGAATCcAGC GATTGCAGCT GAAATGAAAG 28200 

AACCAAAGCG GGCAACTGTA TCTCCTTTTT 28260 

TAATACCGAT AGAGCTAAAG TTAGCAAATG 28320 

CGTGTTCAGA TAAATCACTA AGTTTACCAA 28380 

ATAGTTTTGT CGCCATAACT GAACCGGCTT 28440 

AGAATGCAAA TGGTGCAAAG ACAAAACCAA 28500 

CACCTGAAAC TGTACTAAAG ATATTGCTTA 28560 

TGTATCCX3AT TAACATTGCG CCTACAATGA 28620 

CTCCTAGCAT TTCGAAGAAT GATTGTTGTC 28680

CATCTTCTTC ATTAACTTTA TAAGGGTTAA 28740 
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TAGGTTCAAT TAAGGTAAAG TATGCACCGA TAATTC5AAGC AGAAACAGTC GACATTGCTG 28860 

AAGCTGTTAA TGTGTATAAA CGTTGCTTAG GTATGTATGG TAATTGTTTT TTAATTGAAA 28920 

TAAATACTTC AGATTGTCCC AAAATTGCTG CAGCAACTGC ATTGTATGAT TCTAAACGTC 28980

CCATACCATT AATTTTAGAA ATTAAGAATC CTAAAACATT AATGATTAAA GGTAAAATCT 29040 

TTGTGTATTG AAGGATACCG ATAATCGCTG AAATAAATAC GATAGGTAAT AATACACTGA 29100 

AGAAGAATGG TGGTTGCTTA GQATCGATAT ATTGAATACC ACCGAATACA AAGTTAACAC 29160 

CATCTGCTGC TTTTAATAAT AAGTAGTTAA AACCX3TTTGA AATACCACCA ATAACCTTGA 29220 

TTCCCATTGT AGTTTTAAGC AAGATAAATG CAAAGATAAG CTGAATTGCA AGTAAAATTC 29280 

CTACATATTT CCAGCGAATA TTTTTCCTGT CTGAGCTAAA TAGAAACGCA AGTGCTAAAA 29340 

AGAAGATAAT TCCGATAATC CCAATTAGAA TATGCATATA TTTCTCATTC CTTTAGTTTT 29400 

TTCTACaATc TATCATACAA TAAAATGGAA GGGCTAACAT CATAAATTTT TGAAAATATA 29460 

AAAACAAATT AATTGAAAAA GGTCAAAATA GGTCATATAA TATAGTCAAA GAAGGTCAAA 29520 

AAGGGGTGAT ATACATGCAC AATATGTCTG ACATCATAGA ACAATAaTCA AACGTTTATT 29580 

TGAAGAGTCG AATGAAGATG TCGTTGAAAT TCAGAGAGCG AATATCGCAC AGCGTTTTGA 29640 

TTGCGTACCA TCACAATTAA ATTATGTAAT CAAAACACGA TTCACTAATG AACATGGTTA 29700 

TGAAATCGAA AGTAAACGTG GTGGTGGTGG TTACATCCGA ATCACTAAAA TTGAAAATAA 29760 

AGATGCAACA GGTTATATTA ATCATTTGCT TCAGCTGATT GGACCTTCTA TTTCTCAACA 29820 

ACAAGCTTAT TATATTATTG ATGGGCTTTT AGATAAAATG TTAATAAATG AACGTGAAQC 29880 

TAAAATGATT CAAGCAGTTA TTGATAGAGA AACGCTATCA ATGGATATGG TTTCTAGAGA 29940 

TATTATTAGA GCAAATATTT TAAAACGTTT GTTACCAGTT ATAAATTATT ACTAAATGAA 30000 

ATGAOGTGTT GAAGTGCTTT GTGAAAATTG TCAACTTAAT GAAGCGGAAT TAAAAGTTAA 30060 

AGTTACAAGT AAAAATAAAA CAGAAGAAAA AATGGTGTGT CAAACTTGTG CTGAGGGGCA 30120 

CCATCCGTGG AATCAAGCTA ATGAACAACC TGAaTATCAA GAACATCAAG ATAATTTCGA 30180 

AGAAGCATTT GTTGTTAAGC AAATTTTACA ACATTTAGCT ACGAAACATG GAATTAATTT 30240 

TCAA6A 30246 
(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14333 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

TATTCCCCCA TCGGTTTATT AAATCGTCCA TTTCAATACT GTTTTTCCCC AAGATGTCGA 60 

TAAATCCATT TCAAACGCTT GGACGATATC TTGCATCGTA CATACATTAA TTTCATGTCC 120 

TTTTAATAAT GCTAACTTTT CAACTATGTC TGGGTACTTA CGATATAAAT CAACAACTTG 180 

CTCAAAATCT TTAGAGCCGC TTCGACTACT ACCAATCAAC GTTAATCCTT TTTCAAGTAC 240 

TAATCGTGTA TTCACTTCCA CGGGTAATTC ACTTACGCCT AACAAAGCAA TACTGCCTTC 300 

TGGTGAAATA TGTTCAACTA TTTGTTGAAG TGCAACTTGA CTTCCTTTAC CTCCAACACA 360 

TTCAAATGCA TGATCAATTT TAAGATCATC TGGTATTTQA TTTACTGTAA AGATGTCATC 420 

TACAAATGAA AAATGACTTA ATTTATAGTC TGTCTTACCA AATACATAAG TTTTAGCTrC 480 

TGGGTACAAC TTACGTAGCA AAATAGCAGT AATATAACCT AAGTTACCAT CACCCCAAAT 540 

ACCAAAGCTG GTTTTCAAAG GTATAGATTT ACGTTCAAAT CGTTGTATAG CATGATAACT 600

TACTGACACT AACTCTGTGT ATGAAATCGT ACTCAAATCA ATGTCATTAG GCAGCGGAAC 660 

GATACGATCA TGTGCCATCA CAACGTAGTC TTGCATAAAA CCATCATAAC CACTAGATCT 720 

AAAATAACTA GAGGCTAAGT AATTCTCCGC AATAATATGA TGTTGCTCTG TAGGTGTATT 780 

CGGTACCATT ACTACTTTCG TACCTTTTTC AAATACCCCT TTACTATCAA ATACAACTTC 840 

ACCAACAGCT TCATGAACTA ATGACATTGG TAATTTTTTG CGTAGTACAT TTTCATCTCT 900 

TCGACCTGTG TAATACCTTT GATCAGCTGC ACAAATAGAC AAGTATAAAG GTCTTACGAT 960 

GACATGATTA CCATAAATAT CAACATTATT ATATGTGAOG TCGAACTGTC TCGGTGCAAC 1020 

GAGTTGATAT ACTTGATTAA TCATCGGCAA TATCACCTTG AATAATGGCA TTTGCTACTT 1080 

TTAAATCATA CGGTGTTGTC ACTTTAATGT TGTATAGTTC TCCaCGTACC AATTTAACTG 1140 

CAT5TCCAGA TTCGACAATG ATTTTACATG CATCTGATAA GATTTCTTTT TGTTCACTAC 1200 

TTAAGGCGCG ATAACTATCT TGTAATAATT TAATATTAAA TGATTGTGGT GTTTGGCCTT 1260

GATACATTTC ATTCCTTACA GGGATACTGT GTATGTTCTG TTTATCTTTA GACATTACAA 1320 

TCGTATCAAT TGCTTCAATG ACTGTATCTA CTGCACCATA TTTTGCTGCT ACTTCAATGT 1380 

TCTCTTTAAT AATACGTTGA GTTAAAAATG GTCTTACGGC ATCATGAGTT ACAATCACAT 1440 

CATCATTATT AATTCCATTT ACATTGCGAA TATGGTCGAT AATGTTCATA ATTGTTTCAT 1500 

TTCGATCCGT ACCACCTGCA ACTACTTTGA CACGTTGATC TGTAATGTTA TATTTTTTTA 1560 

AAATATCCTG TGTATGGGAA ATCCACTGTG CTGGCGTTGC GATAATAATC TCATTAAATT 1620 

CACTCACTAA AATGAACTTC TCAATTGTAT GGATTAAAAT CGGTTTATTA TCAATATCTA 1680 
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CTGCATAAAT CATGTTGTCC TCCATTCTGT CATTACATCA TTTCCATTTA TACATTACTG 1800 
ACCTATGCCC GCACATAAGC CTAACCTATT GCTCACTTGC CTCTTTTATT AATCCAAAGA 1860


TAGTTGTCAC AATAGTGTGA TAATTTTTTA TAAAAATGTA TTTTTGTAAC TGACCATTCT 1920 
AAGTTGTTTT GCCATGCAGT TAATCATTAA CTCTGACGAT ATTAAATTGT TAAAGGTATT 1980 

AATGTTTACT CTTTTTCAAA TTCATTATTA CTGCCATCAT TTTACCATAT ATTATAATAA 2040 


ATTTATCTTA TTAAGTCGCT GTACTTGATT TTCACTTTAA AAATTATCAA ATATTGCCAT 2100 

CTCATTTTAA GTATACAAAA TGCAAAACAA CCGATTCACA AGCATATTTC ACACAAGTAA 2160 

ACCGGCTATT TATCAACGTA TATTCGAAGA TGAATTATTT CGATAGTATC TATAGACCAG 2220 

ACGGCATTCG CACTTTCATA GCTATAACTA TACCAGOGTT TTOyrCCTCA AAGGTGCATA 2280 

CTAATAAATC GTAAACATGA CTTTATCAAA TCGTTCTTTC TTGTTAACTA ATTTATCAAA 2340 

TGTCTCCX3GG CCTTTTTCTA ACGGTAAAAA ATGAGAAATA ATAGGCTTTA CATTAATATC 2400

TTTCGTCTTC ATATAATGTA AGGTTGCOGT CCACTCTTTG CCCGGAAAAT TACTGGACAA 2460 

ACAGTTCCAA GAGCCACATA CTGTCAACTC GTTACGCAGA ATTTTTTCAA AATGAACGCG 2520 

ATCAATCTCA ATATCATCAT ATGGTATTCC GAGTAATACC ACCTCGCCAC CTTTTTTAGG 2580 

TAGCGTCAAT ATTTGACCAA TCGTAACTTT AGCACCTGAT GATTCTATAG CTAAATCGAT 2640 

TTGATTGGCG TAATGATTTT CGATGAATTT CTCAAGATTT TCTTCTTTTG AATTGATTGT 2700 


TTGATGTGCG CCCAATGATG TTGCAATATC TAGTTTATGC GCATCTATAT CTATAGCGAT 2760 

GATATQTGCA GCACCAAATA TTCGTGCCCA TTGAATAGCT AACAAACCTA TACTGCCACA 2820 

CCCCATTACT GCAACAGTCA TACCAGGTTG TATATTCGAT TTATAAAACC CATGCGCAAC 2880

AACGGCTGAT GGCTCAACCA TTGCTGCTTC AATGTAATCA ACATTGTCTG GAACCTTTAA 2940 

AACAlTTTGC GCTGGCAATT TGACATATTC CGCGAACGAT CCAGGTTCAT ATGAGCCAAT 3000 

GACGAATAAC TTTTCACATC GTGCATATTC ACCTTTTAAA CAATACTCGC ATTGATAACA 3060

AGGTATTGCT GGGCAACCTG TCACnTGTC GCCCACATTA ACATGCGTAA CATCACTTCC 3120 

AATGGCATCT ACTACACCTG AAAATTCATG ACCAAATGGC ATACCTTTAA TGTATGGCCC 3180 

CATTTTTTTG TATCGTGACG TGTCTGAACC ACATATGCCA GTCX3CTCGTA CTTTAATAAT 3240

AACGTCATTC GCACTTTCAA TGACTGGCTT TTCATTATCC TCATACCGTA AATCTTCCAC 3300 

GCCATATAAT TTCAATGCTT TCACTTGTAA ATCACCTCAA ATTTGATTTA ATTCACAACT 3360 


TTTTTCTTTT TAAAAATACC TGTCGCAAAA TAACCTGCAA TGACAATGGA ATTACTTACX3 3420 

AGTAAATGTT CCATATAAAA ATCAGTGATT TGTCTTAATG GCCCAAGCAT AAAAGTTAGC 3480 
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TGCTTTAATA CCTTCGCCGG ATTTTAAATG TTGATACGCC TCGTCCCATT TCGAAATATC 3600 

ATATATTTTT GTCACCAAAG CTTCAGCATT TACTAAACCA TCCGCCATAA GTTGCAATGA 3660 

AGGTTCCCAA TCTGCTGGCT TTTGACTTCT ACTACCAACA ACTGTTATTT CTTTTTGAAT 3720 

CACTTTTTCC ATATCAAATG GAATTTCAGC ATCCTTAAAA ATACCTATTT GACTGTAGAA 3780 

ACCTTTTTTG CGTAAAATAT CCAAACCTTG TCGTGCTGCT GGAACTGCAC CTGAACATTC 3840 

AACAACAACA TCTGCACCGT AACCGTCTGT AATTCCATTG ATATACGTTT TTAAGTCTGT 3900 

TTGTTGTAAA TTGACTACAT AATCCATGTG CAATGCTTCT GCTTTATCTA ATCTGACTTT 3960 

GTCATTGTCC AATCCAGTTA CCACAACAGT TGCGCCTTTA CTTTTTAACA CTTGTGCTAC 4020 

AAGTAATCCG ATTGGCCCAG GTCCCATTAC AACTGCTACA TCGCCTGAAT TGACTTGAAT 4080 

CTTAGAAACG CCATGATGTG CACATGCTAA TGGTTCTGTC ATAGCTGCAG ACTGATACGA 4140 

TAtTCGTCTG GAATATGATG CAAACTTTCT TCACGTGCAA TGACATAATT AGTAAATGCG 4200

CCATCAACTT GTGTTCCAAT ACCTTTTCGA TGGTT6CATA AATTATAGTC TTTTGATTTA 4260 

CAGTATTCAC ACTCATTACA AACATAGAAT GTCGTTTCAO aTGtGACACG GTCACCAACT 4320 

TTAAAATCTT TAACGTCTGC TCCAACTTCA ACGATTTCAC CAGAAAATTC ATGACCTAAT 4380

GTCACTGGAA AATTAACTTT ATAATGACCT TCATAAGTAT GAATATCTGT GCCACAAATT 4440 

CCTGCATAAT GTACTTTAAT CTTTACTTTA TCATCTAGCG GTGTTGCAAC TTCTTTATCA 4500 

AGAAGTTCTA AGTTGCCATG TCCTTCTCTT GTTTTTACTA AAGCTTTCAC CACAAACACC 4560 

TCGATTTTTA ATTGAATAGA CTAAATAGTT TAAAGATAAG ATAGTTAACG ATATTACCAC 4620 

CTTGATCAAT ACTTGAAATT TCAGATGAAC CTTTTGGCAT TTGTACATTC GTACCTTTCG 4680 

CCATATCTGT GAAAATGGGT GCTACGTCTG TTGCAATATA TAGTGAAATT GCAATCATAA 4740 

TCGTACCCAC AATGACAGAA TGAATAATGT TTCCTCTTGC TGCACCAACA ATAAACGCGA 4800 

CAACAAATGG TATCGTTGCT AAGTCACCAA AAGGTAGTAC TTGGTTTCCT GGTAAAATAA 4860 

CGGCTAATAA AACAGTGATA GGTACTAAAA TTAATGCTGT CGAAATAACT GCTGGATGAC 4920 

CTAATGCTAC AGCCGCATCC AATCCAATAT AAATTTCACQ TTCGCCAAAA CGTTTATTTA 4980 

GCCATGTTCT TGCAGACTCT GAAACTGGCA TTAAACCTTC CATTAAGATT TTTACCATTC 5040

TAGGCATTAA TACCATTACT GCAGCCATTG ACATTCCTAA ATTAATGATG TCTCCAGGTT 5100 

TGTAACCTGC TAACACACCA ATACCTAAAC CTAAAATTAA GCCGACAAAT ATAGACTCTC 5160 

CAAATGCGCC AAAACGTTTT TGAATTGTTT CAGGATCAGC ATCTAACTTA TTCAGACCGG 5220 

GTACTTTTTG TAACAATTTA ACTAAGTAAA TACCTGGTGC ATAAGAAATT GTACTTCCTG 5280 
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CTACTTTCAA ACAGATAATT TGGAAAATAA CTGCTGCTAA TAACGCTTGC CAAATACTGC 54 00 

CTGATACGGC ATAAACCATT GCTGCTGTAA ACGTATAATG CCAAAAATTC CAAATATCTA 5460 


CATTCATCGT CTTTGTCACT TTAGTTACTA GCAATACAAC GTTAACTATG ATTCCGAGTG 5520 

GAATAATAAA TGCTGCGACA GATGATGCCC AAGCGATAGA TGATGTTGCT GGCCAACCTA 5580 

CATCAATCAC ATTCAGACTG ACGCCTAAAT TTTTAACCAT CGCTTGTGCT GCTGGCCCTA 5640 


AATTTTTAAC TAATAAATCG ATGACTAAGA AAATCCCTAC AAAAGCCACA CCTATTGTTA 5700 

AACCAGACCT AAATGCCGCT CCAATTTTCT GCCTAAAGAA TAGGCCAAGC AAGAATATGA 5760 

CAACCGGTAA AATAACAGTt GCACCTAAAT CTAAAAATCC CCTTACAAAA TCAGTGAAGT 5820

AACTCATATT TAAACCCTCC CTGTTATATA TGCATTGTCA CXy^TACTTTC OGATTGTGAT 5880 

TACATTTGAC GTTACAGTCA TTTCAACGAC AACCCTTGCT AAATTCGACT GCAGTCCTTT 5940 

TGAATTACAG tCACTGOGTT TCTATGTCAT CAACAATCAT TTGTCGTGAT AGTCATTTAT 6000

ATGCAATTTG CATATATTAA TATGTTATCG ACCCACGTTA CATATCAATT CCGTTATTTT 6060 

TGTAACTCTG TTAAGATTTG TTGTTTTGTT TCTTCAATAC CAATACCAGT TAAGAAATTA 6120 


CGTGCGTTGA TAACTGGGAA TTTATATTCT TTTTTTGTCA TTGCAGTTGT AACTAATAAA 6180 

TCTGCAGTGT CTTCATAAGG TCCAACTTCT GTAATTTTGA TTTGTTTAAT ATCTACTTTA 6240 

ATATTGTGTT CCTTTGCCAT TTCTTCAATT GCATTATTTA CTACTGTTGA CGTTGCAATA 6300 


CCTGCACCAC ACGCTACTAA TACTTGTTTC ATTTTCAATT CCTCCAATTA ATTTTTAGTT 6360 

ATATTCCAAA TAATCATTGA TTAGTGTTXSC TAAAATTGTT TCATCTTTCG TTCGTAGAAT 6420 

CTGCTCCAAT TTTTCTTCAC TTTGAAAAAT TTGCATCAAC TGTTGTAACA GCTTAAGTTG 6480 


ATCATCTACT TTATCCATTG CTAACATAAA AACGATTTTC ACTTCTGTCT GTTGATCAAG 6540 

TGTTCCCATT TCAATAAACG GCACTTCTTT TTCTAGAACA GCCACACCTA TCGTTCTATG 6600 

GTTAATATGT TCGACATCTG TATGCGGTAT AGCGACCGAA CATAGATGCG TTGGTAAACC 6660

AGTAGCAAAT TCTTTTTCTC TGTCGATGAC TGCATCTTTA AACGTTGACT TCACGAACCC 6720 

ATTTTGAAAT AACACATCTG ACATTTGTGA CAATACGGAT TCTTTATCAG TTGCCGACAA 6780 

ATTGAGCATT ATATTTTCTT TATGCACTAA TTGCTGTCCC ATCCATTTTC CCTCGCTTCT 6840

TTATTTGAAT AATTTTTTAA AATCTCATTT ACATCAGAAT TTTTGCGACT TTGTATGATG 6900 

CGCTTAATTG CGTCATTGTC TTGCGCCACA TCTCTCAATT GTAGTAACGC TCTTAAGTGT 6960 


GTCACTTTAT CAACAGCAGC AATAGGTACA ATAATATGGA TTGCTGTGCC ATCTGACATG 7020 

TATATTGGTT CTTGTAATAT CAACATACTC ATOGCTGTTT TATGTACATG CTTTTCAGAG 7080 
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TGCATCTCAT GAATATATTT AATATCAATA AAATGATTAG CAACTAACAC ATCACTTGCT 7200 

TTAGCAATAG CTTCATCAAT ATTTTCAACA TGATGCATTC TTTTCACGTG CCTTGCCGGT 7260 


ATCAAGTCAG CTAAATCTAA TGyCTwATTT tGTGtGACaA TCGATCCATT AATGGTTGAA 7320 

ATTGAATTAT AATTGGCAAT AAAATCTTCT AAACCATCAC GTAGTcTGTA ATGTCATTAA 7380 

CTGTCGTTGT GCGTTCAATT AATGCCATTA ACTTGTTTAT TTCCTTATCA ATGTCAGCCG 7440 

ATTCCTTATT AATGTACTTC ATCACTTCTT TACGTAACTT TCGTTGCTCA TTTTCAGATA 7500 

AAGCTACTTT TGTGATAAAT AATTTTTTAT GTGTTAGGAC AAACATTGGT GAAAAGACGA 7560 

TGTCATAATC TAATGTGTAA TTTTCAAATG TTCTAAGTGA AATCGCATCT AAGAAAATAA 7620

TTTCTGGAAA TAAGTTTCOC AACTCGTATA ACATCATTTG TGATACTGAC GTGCCTTCTG 7680 

TACACACGAT AATAGCTTTT ATCTTGCCAT CGAAGTTTTC ATCTTGACGT CTCAAACTAC 7740 

CTCCGAACAA CATGGTTAAA TATGCTATTT CATTATCAGG CAACGATTTT CCGAAATATT 7800 

CAGTTAACGA TTGACATGAT TGTTTCACCA TATGAAATAA GGATTGATAA TTTCCTTGTA 7860 

AAGGATTTAT TAATTCATCA CGATCCGTTA AGTTATATTT AATCCTATAA AAAGCAGGCG 7920 


TTAAATGTAA CAAGAGTTGC TGTGATAATT TCTCCTTATC TTCAATGTTA ATAAAAGTGA 7980 

TTTGTTCAAA ATGGTGAATC ATTTGAGCGA TGGCCATCGT TAAATTCGAT ATGCTATCTG 8040 

ATTCTTGCAA ATCAGTCCAT TGCACACTTG TTGAAAGTAA GTGTAATGTC AAATATAACT 8100 


TTTCCGCTTC TGGCAAATCC GGCTCATGTT GCGTCATAAT CTCCGTTGCT TGATATTCTT 8160 

TCGTATCCCT CAAATACTGA TAATTAATAT TTAATGGATT CATCACATGA CCACTTTGAA 8220 

TTCGTCTACG AATCACACAA AGGACATAAG GCAATGAACT AAGTGATTTG TCTATAAAGC 8280 


GACTCTTCAA AAATTGTTCT ACCTGTTTGA TCTTGTCTTT TTGATATGCG ATATCTTCGA 8340 

ATCfflAAGTT GAGCGCCTTT AAAACTTCAC TTTTAGTAAT ATCATGATTC AACCTITGAT 8400 

CAATCAACTT AATGAAGAAA CGGCGAACTT CAAATTCATC ACCAACAATT TCATAACCAT 8460

GTTTTCGAGA ATACTTAAGT GACAAACCAT GATTTTCCAA TTGCTCTTTC ACATGATTTA 8520 

TATCGTGAAT GACAGTATTT TTACTGACTT GTAAATCAAT TGAAAAATGG TTTAGAGACA 8580 

TTGCGTTTTC CTTACTAAAA AGCATGAGCA TTAAATAATA ACGACGTGTT TCTATGCTAA 8640 

AAATGACATT GTTGCCGTTT AACATTTGCT GCTCCGATAC ATCTCGCTTG AATAACGTCA 8700 

TGATTTCAGA ACTTACAATA AAATTTCCTT GGCTTGTTCT TTCAAGTTTT GGATAACCCT 8760 


CTTGTTCAAG CCACAAATTG ATTTTTTGAA TGCGATATCC TAGTTGTCTA CGAGACAAAC 8820 

CAAATATCGA TTCAAGTTCT TTACCATGAA TAGTAGGATT CAATACAATT TCTCTGAGTA 8880
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TCAATCGTCA CACCGATGTA CACACTTTGA ACACATATTT TCAAAATGAG CATGTACATC 9000 

ATTGTGATGT TTTAACAACA TTTCAATTAT ATCTATATTT TTTGTGATTT TAATCTTTTA 9060 

AAATAAAGCA ATTGAAATTT TTGCATATAT TTTTGTGTTT TGTGTTTTTT TGAAGCATTT 9120 

TTAACATACA TATCTCAATC ATTATCAAAT TGTCATGACC ATTGTAACCC AATACAAAAA 9180 

CCCTAAGGAC GCTTATATCA GGCGCCTTAG GGTTAACTGT ATCTATTTAA TTAAGTATTA 9240 

TTATTCGTAT GTACGTAACT TATGGTCTAT CAAGTTCCAC ACTTCTTCAA CATCAACTGC 9300 

TGTAGCAAAA TAAGCATTGG CAGGCTTACC TGTAACATGA TTTAAATCGA CAGCCATAGT 9360 

GCCATAAGTT AGTGGACTTT GATGTTCAAT GTCGATATTA ACGGGTACCA TTGTAAACAA 9420 

TTCTGGTTGT AACAAATACA AAATTGTACA AGCATCATGT ATTGGACCAC CATCCATATT 9480 

AAAGTGAGTC TTGTATGTCT TCTTAAAGAA TTGCAATAAT TCTACGACGA ACTGTGCAAC 9540 

AGGATTATTG ATACTTTCAA AGCGTTCAAT CACGTGATCG TCGGCTAAAA CTTGATGTGT 9600 

TACATCTAAA CCAAACACAT TTATAGTAAT CCCACTTTCA AAAACACGCT TCGCT6CTTC 9660 

AGCATCTACC CAAATATTGA ATTCTGCTGT AGGCGTCCAA TTTCCAAATG TACCACCACC 9720 

CATCAAAGTA ATAGATTCAA TATGCTCAGC GATTCTTGGC TCACGAATCA ATGCCGTTGC 9780 

TACATTCGTA AGAGGACCTG TCGCTACAAT TGTTACAGGT GTATCACTCG TCATCACTTT 9840 

GTTTATAATC ACATCTGATG CTGGCATTGC AACTGCTTGA CGTGATGGTG TCGACGGTAG 9900 

TTTCGGACCA TCTAATCCAG ATTCCCCATG TATTTCAGAA GCAAAGGCAG CTGGTTTAAT 9960 

TAACX3GCCTA TCCGCACCTT TCGCTACTGC TATATCTTGG CGTCCCATAA TATCCAATAC 10020 

6TTCAAGGCG TTTGTCGTAT TCTTGTCAAC TGATTGATTA CCTGCGACTG TTGTTACAGC 10080 

TAATATCTCT AGTGGACTGT CAATTGCCCC CGCTAAAATT AATGCTATTG CATCATCGTG 10140 

TCCTDGATCA CAATCCATAA TAATCTTTCT TTTCATTTAT ATATCCACCT TTCTTAAGTT 10200 

GTTATCGATA GCTTATGTAT ATTTATTTAT GTGGTGAATC ATGTTTATTT TGAAAAATAG 10260 

TTTTAACTTT CTCATATTTT TGGATACAAA CACTATTTAT CTATTTTATG GCTTATAAAT 10320 

TTATCCGATA TGCCTTATCA ACCTACCTCG CTAAAAATAG GATGTCTACA TATCTATACC 10380 

GACTTTTGTC AACTCATTTT CACAACAATA TAAACAGCAA TTTATATGAT TGTTACATGA 10440 

TTCAAACAAT TTTTATGAAA AATATTTTCA TACACAGAAT ATATATTGAT ATTAAATTTC 10500 

TCAAAAGCTA TATTGAGAAT AATTAGGAGG GATGTTGATG AAATCTTTAT TTGAAAAAGC 10560 

ACAGCAGTTC GGCAAGTCCT TTATGTTACC TATCGCAATC TTACCAGCTG CAGGTCTATT 10620 

GTTGGGTATC GGTGGTGCAT TAAGTAATCC AAACACCGTT AAAGCATACC CTATTTTAGA 10680 
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AAATTTACCG GTCATCTTTG CAATTGGTGT CGCAATCGGA TTATCTAGAA GCGATAAAGG 10800 

TACTGCAGGT tTAGctGCGC TGCTCGGTTT CTTAATTATG AACGCAACTA TGAATGGCTT 10860 

ATTAACTATC ACGGGCACAT TGGCAAAAGA TCAGCTTGCA CAAAATGGAC AAGGCATGGT 10920 

GCTCGGTATA CAAACGGTTG AAACCGGTGT TTTTGGCGGG ATTATCACAG GTATTATGAC 10980 

CGCMTACTT CACAACAAAT ATCACAAAGT GGTATTACCA CCGTATTTAG GTTTCTTTGG 11040 

TGGCTCTAGA TTTGTCCCTA TTGTCACAGC ATTTGCCGCA ATCTTTTTAG GTGTATTGAT 11100 

GTTTTTCATT TGGCCAAGCA TACAAGCCGG CATTTATCAT GTTGGTGGAT TTGTAACGAA 11160 

AACAGGTGCC ATCGGTACTT TTGTTTATGG CTTCATCTTA AGATTGTTAG GTCCACTCGG 11220 

TTTACACCAT ATTTTTTACT TACCX5TTTTG GCAGACGGCA CTTX5GTGGTA CTTTAGAAGT 11280 

CAAAGGGCAC TTAGTTCAAG GTACOCAGAA CATCTTCTTT GCTCAACTTG GTGATCCAGA 11340 

TGTGACGAAG TATTATTCAG GTGTGTCACG CTTTATGTCA GGCCGTTTTA TTACGATGAT 11400 

GTTCGGCTTA TGTGGTGCCG CACTTGCAAT TTATCACACA GCTAAACCTG AACATAAAAA 11460 

AGTTGTCGGC GGTTTAATGT TATCCGCTGC ACTCACTTCA TTTTTAACAG GTATTACCGA 11520 

ACCTTTAGAG TTTAGTTTCT TGTTTGTCGC ACCTATTCTT TATGTAATCC ATGCCTTCTT 11580 

TGATGGATTA GCATTTATGA TGGCAGACAT TTTCAACATT ACAATTGGTC AAACCTTCAG 11640 

TGGAGGCTTT ATCGATTTCT TACTCTTTGG TGTGCTACAA GGTAATAGTA AAACAAACTA 11700 

CCTATACGTC ATACCTATTG GAATTGTGTG GTTCTGTTTG TATTACATCG TTTTCAGATT 11760 

CTTAATTACG AAATTTAATT TCAAA?lCACC TGGTCGAGAA GATAAAGCTG CAGCACAACA 11820 

AGTTGAGGCT ACTGAAAGAG CACAAACTAT TGTTGCTGGT TTGGGAGGCA AAGATAACAT 11880 

TGAAATCGTT GACTGTTGTG CAACX3AGACT ACGCGTCACA CTTCATCAAA ATGACAAAGT 11940 

CGATAAAGTA TTACTCGAAA GTACTGGTGC CAAAGGTGTA ATCCAGCAAG GCACTGGTGT 12000 

GCAAGTAATT TATGGGCCTC ACGTTACAGT TATCAAAAAT 6AAATTGAAG AATTGCTCXK5 12060 

GGATTAAGAC TAACCGAAAT ATCAACAGAA CTAATGGCAA CGATGTACGA AGTAAGAAGT 12120 

GACATCGTTG CTTTTATTTT TAATGTTACA TTTGAAGCAT TAAGTTCATC ATGCACTGTA 12180 

GTGAGCCCGC AAATCGCCTC TGCTAGACAA TCATCTTAAT GCTATGATTA AAGCTTAAGT 12240 

GCCAGATTTG AATTTAATTT CAACAACGAC TTTCACTACA TTAAAAATAG GGCCACTCGA 12300 

CACATATAGT TGTATCAAAT AGCCCTTTAT ACAATTTTTT GGGTAAGGTT TTACAATTTT 12360 

TGGGATGGTA TAGATTTTAT AAAAAGTTAT TTAAGTTCTT CTGCTTCAGC CATAATATCT 12420 

TITAATGTTT TAGCTGAATG TGCGAACTTG CTTTGTTCTT CGTCGTTTAA TCGGATTTCT 12480
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TCCTCATATT CGCCTTCTAA TAATGCTGAT ACAGTCAATA CGGCATCTTC ATTTCTGAAA 12600 

ATCGCTTCAG TAATTCTAGC TAATCCCATT GCAACACCAT AATAAGTGGC ACCTTTAGCT 12660 

TGAATAATGT CATATGCTGC ATCACGTGTT TGAACAAAAA TTTGTTCAAT TTGCGCTTTG 12720 

CCCTCAGGAC GTTGTTCAAG TAATGTCTTC AAAGGTT6AC CCGCAATATT AGCGTGTGAC 12780 

CATACTGGTA ATTCAGTGTC ACCATGTTCA CCAATAATTT GAGCATCGAC GCTACGTGGC 12840 

GCAACATCGa AcgyTcGCTT AACAATAATC TAAAGCGTGC AGAGTCTAAA ATTGTACCAG 12900 

AACCTATAAC ACGTTCTTTA GGTAAACCAG AGAATTTCCA TGTTGCATAC GCTAAAATAT 12960 

CAACAGGATT TGTAGCTACC AAGAAAATAC CATCAAATTT TGATGCCATT ACTTCACCAA 13020 

CAATTGATTT GAATATTTTC AAGTTTTTA6 ATACTAAATC TAAACGTGTT TCTCCAGGTT 13080 

TTTGTGCAGC ACCAGCACAG ATGACAACTA GATCCGCATC ATGACAATCA CTGTATTCGC 13140 

CAGCTTTCAC ACGAACTGTT GTTGGAGAAT ATGGTGTGGC ATGTTTTAAA TCCATAACAT 13200 

CTCCTCGAAC TTTTTCAGTG TCTAAATCAA TGATGACTAA TTCATCAACA ATGCTTTGGT 13260 

TCACTAATGA AAATGCGTAG CTTGAACCTA CTGCACCATT ACCTATTAAT ACAACTTTGT 13320 

TCCCTTTAAA TTTGTTCATT ACAAAAACTC CCTTATGATT AATTCACTAA CATACATGTA 13380

GCTTCAAATA TGTTAGTTTA ATGCTGCTTA TTGACGATAC AAAAGCAAAT AAACATCTCT 13440 

TTTATTTTCA ACGCATAACT TAAAAGGTCA TGTGTCATCC GCTTTTAAGT TTGTGATTTA 13500 

TTTCACATAT AAAATGTAAC ATGCATTAAG TACTGGGTCA ATATTAAATT GTGATTTATT 13560 

TCACATTTTA TTTTAATTTT TACACCTTTT TAATTTGTAT mCGATTACAT CTTAGATGTC 13620 

TTTAGTCTTC GTACTTCGCC AGTGATTATT TACACTTTCA CATTTTTATT ATCATGTTTA 13680 

CTTTTTTCTA GGAAAACAAC AATGTTTTTT GAATTAGTCA AATAAATGCG CTCAATCGTC 13740 

GGTGTGCAAA CAGACAATTG TACACAATGC TTATTGATAA GTATTTAAAA AATTAAAAAT 13800 

GTCATACAAT TATCAAATTT GCCATTTTAT TTATATTTTC TCAAACCAAT TAATTGAATA 13860 

TCGAAATTTT TAGTAGAATA ATCAAAATAT ACAGATTAAA GGAGGAGTAT CATGCTTACA 13920 

GAACAAGAGA AAGACATTAT CAAACAAACG GTGCCTTTAC TTAAAGAGAA AGGGACAGAA 13980 

ATTACXSTCAA TCTTTTATCC AAAAATGTTT AAAGCGCATC CTGAACTTTT AAACATGTTT 14040 

AATCAAACGA ACCAAAAACG AGGCATGCAA TCTTCAGCAT TAGCACAAGC TGTAATGGCC 14100 

GCAGCGGTTA ATATCGATAA CTTAAGTGTT ATTAAACCAG TCATTATGCC AGTCGCATAT 14160 

AAACACTGCG CACTACAAGT TTATGCTGAA CATTATCCAA TTGTGGGGAA AAATTTATTA 14220 

AAAGCCATTC AAGACGTGAC AGGATTAGAA GAAAATGACC CTGTCATTCA AGCTTGGGCA 14280 
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(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8779 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

GGTATTTTnG CSAnGGGTACC TAAAGCAATT CCGGCAAAGG GTnAATCCAG GTACCGAAAT 60

GGACTTCCCG TTATCGATAA TACCGACATA TATTGTGACA AGTAGATTTT ATGGACATTT 120

AGGCTTACTT TTACTTGTGA TAATTGCATG TATGTTTACT GGTATTTAtC CaTCaATACA 180 

TATCATTCAA TTATTGATAT ATGTACCGTT TTGTTTTTTC TTAACTGCCt CGGTGACGTT 240 

ATTAACATCA ACACTCGGTG TGTTAGTTAG AGATACACAA ATGTTAATGC AAGCAATATT 300

AAGAATATTA TTTTACTTTT CACCAATTTT GTGGCTACCA AAGAACCATG GTATCAGTGG 360 

TTTAATTCAT GAAATGATGA AATATAATCC AGTTTACTTT ATTGCTGAAT CATACCGTGC 420 

AGCAATTTTA TATCACGAAT GGTATTTCAT GGATCATTGG AAATTAATGT TATACAATTT 480

CGGTATTGTT GCCATTTTCT TTGCAATTGG TGCGTACTTA CACATGAAAT ATAGAGATCA 540 

ATTTGCAGAC TTCTTGTAAT ATATTTATAT GACGAAACCC CGCTAACCAT TAATAAATGG 600 

AAGTGGQGTT CATTTTTGTT TATAATTTAA GTAAATAACA TATTAAGTTG GTGTATTATG 660 

AACGTTTTAA TAAAGAAATT TTATCATTTG GTAGTTCGAA TACTTTCTAA AATGATTACG 720 

CCTCAAGTGA TTGATAAACC GCATATCGTA TTTATGATGA CTTTTCCAGA AGATATTAAG 780 

CCTATCATCA AAGCATTAAA TAATTCGTCG TATCAGAAAA CTGTTTTAAC AACACCAAAA 840 

CAAGCGCCTT ATTTATCTGA ACTTAGCGAC GATGTTGATG TGATAGAAAT GACTAATCGA 900 

ACATTGGTAA AACAAATTAA GGCTTTGAAA AGCGCGCAGA TGATTATTAT CGATAATTAT 960

TACCTATTGC TAGGTGGATA TAATAAGACT TCTAATCAAC ACATTGTTCA AACGTGGCAT 1020 

GCAAGTGGTG CATTAAAAAA CTTTGGCTTA ACAGATCATC AAGTCGATGT GTCTGACAAG 1080 

GCAATGGTTC AGCAGTACCG TAAAGTTTAT CAAGCGACGG ATTTTTACTT AGTGGGTTGT 1140 

GAACAAATGT CACAATGTTT TAAACAGTCT TTAGGTGCAA CAGAAGAGCA AATGCTGTAT 1200 

TTTGGGCTTC CGAGAATTAA TAAATATTAC ACAGCTGATA GAGCAACGGT TAAGGCAGAG 1260 

TTAAAGGATA AATATGGAAT TACAAATAAG TTGGTATTAT ATGTACCAAC ATATAGAGAA 1320 

GATAAAGCAG ATAATAGGGC TATTQATAAA GCTTATTTTG AAAAATGTTT ACCAGGATAT 1380 
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ATCGACACGT CTACATTAAT GCTAATGTCA GATATAATTA TTAGCGACTA TAGTTCGCTG 1500 

CCAATAGAAG CTAGCTTGTT AGATATTCCA ACTATATTTT ATGTGTATGA TGAAGGAACA 1560 

TATGATCA6G TGAGAGGCCT GAATCAATTT TACAAAGCAA TACCGGATAG CTACAAAGTG 1620

TATACTGAAG AAGATTTAAT AATGACGATA CAAGAAAAAG AACATCTATT AAGTCCGTTA 1680 

TTTAAAGATT GGCATAAGTA TAATACTGAT AAAAGTTTAC ATCAGCTCAC AGAATATATA 1740

GATAAGATGG TGACAAAATG AGGTTTACGA TAATCATACC TACATGTAAT AATGAGGCAA 1800 

CAATTCGACA ATTGTTAATA TCTATTGAGA GTAAAGAACA CTATAGAATC CTTTGTATTG 1860 

ATGGTGGTTC TACTGATCAA ACAATTCCTA TGATTGAACG GTTACAAAGA GAACTCAAGC 1920 

ATATTTCATT AATACAATTA CAAAATGCTT CXSATA6CTAC GTGTATTAAT AAAGGTTTGA 1980 

TGGATATCAA AATGACAGAT CCACATGATA GTGACGCATT TATGGTCATA AAACCAACAT 2040 

CAATCGTATT GCCAGGTAAA TTAGATAGGT TAACTGCTGC TTTCAAAAAT AATGATAATA 2100 

TTGATATGGT AATAGGGCAG CGAGCTTACA ATTACCATGG TGAATGGAAA TTGAAAAGT6 2160 

CTGATGAGTT TATTAAAGAC AATCGAATCG TTACATTAAC GGAACAACCA GATTTGTTAT 2220 

CAATGATGTC TTTTGACGGA AAGTTATTCA GTGCTAAATT TGCTGAATTA CAGTGTGaCG 2280

AAACTTTAGC TAACaCATAC AATCACGCAA TACTTGTCAA GGCGATGCAA AAAGCTACGG 2340 

ATATACATTT AGTITCACAG ATGATTGTCG GAGATAACGA TATAGATACA CATGCTACAA 2400 

GTAACGATGA AGATTTTAAT AGATATATCA CAGAAATTAT GAAAATAAGA CAACGAGTCA 2460 

TGGAAATGTT ACTATTACCT GAACAAAGGC TATTATATAG TGATATGGTT GATCGTATTT 2520 

TATTCAATAA TTCATTAAAA TATTATATGA ACGAACACCC AGCAGTAACG CACACGACAA 2580 

TTCAACTCGT AAAAGACTAT ATTATGTCTA TGCAGCATTC TGATTATGTA TCGCAAAACA 2640

TGTTTGACAT TATAAATACA GTTGAATTTA TTGGTGAGAA TTGGGATAGA GAAATATACG 2700 

AATTGTGGCG ACAAACATTA ATTCAAGTGG GCATTAATAG GCCGACTTAT AAAAAATTCT 2760 

TGATACAACT TAAAGGGA6A AAGTTTGCAC ATCGAACAAA ATCAATGTTA AAACGATAAC 2820 

GTGTACATTG ATGACCATAA ACTGCAATCC TATGATGTGA CAATATGAGG AGGATAACTT 2880 

AATGAAACGT 6TAATAACAT ATGGCACATA TGACTTACTT CACTATGGTC ATATCGAATT 2940

GCTTCGTCGT GCAAGA6AGA TGGGCGATTA TTTAATAGTA GCATTATCAA CAGATGAATT 3000 

TAATCAAATT AAACATAAAA AATCTTATTA TGATTATGAA CAACGAAAAA TGATGCTTGa 3060 

ATCAATACGC TATGTCGATT TAGTCATTCC AGAAAAGGGC TGGGGACAAA AAGAAGACGA 3120

TGTCGAAAAA TTTGATGTAG ATGTTTTTGT TATGGGACAT GACTGGGAAG GTGAATTCGA 3180 
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TAAAATCAAA CAAGAATTAT ATGGTAAAGA TGCTAAATAA ATTATATAGA ACTATCGATA 3300 

CTAAACGATA AATTAACTTA GGTTATTATA AAATAAATAT AAAACGGACA AGTTTCGCAG 3360 

CTTTATAATG TGCAACTTGT CCGimTAG TATGTTTTAT TTTCTTTTTC TAAATAAACG 3420

ATTGATTATC ATATGAACAA TAAGTGCTAA TCCAGCGACA AGGCATGTAC CACCAATGAT 3480 

AGTGAATAAT GGATGTTCTT CCCACATACT TTTAGCAACA GTATTTGCCT TTTGAATAAT 3540

TGGCTGATGA ACTTCTACAG TTGGAGGTCC ATAATCTTTA TTAATAAATT CTCTTGGATA 3600 

GTCCGCGTGT ACTTTACCAT CTTCGACTAC AAGTTTATAA TCTTTTTTAC TAAAATCACT 3660 

TGGTAAAACA TCGTAAAGAT CATTTTCAAC ATAATATTTC TTACCATTTA TCCTTTGCTC 3720 

ACCTTTAGAC AATATTTTTA CATATTTATA CTGATCAAAT GAGCGTTCCA TTAATGCATT 3780 

CCCCATCATA TTACGTTGCT TCTCGCCACC AAGGTTTTTA TAGTCTCCTO CACCCATOAT 3840 

AACTTGATTA ATTCTAAATT TACCTCGTTT GGTAGTAATC 6TATGGTTGT AATTTGCTGT 3900 

ATCACTTGAT CCAGTTTTTA AACCATCTGT ACCCGGCAAA CTCATTTTTG CACCTTCCAA 3960 

TGAAAAGTTG AATGTGTAAT ACGTAACTGC ATGCGTTGTT GGTGCTAACT GCTTTGTAAA 4020 

GTCTAATATT TTAGGTGTCT CTTTAATCAC GTGTAAATCT AAAATGGCAT AGTCTCTAGC 4080

AGTCGTTACA GTACGTTCTT GGTCTTTATA CTTTGTTGGT GCAAATGTAC GTAATCTTGA 4140 

ATTTTCAGCA CCCGTTGGAT TGACGAAATG TGTATTTTTC ATTCCGATAG CTTTAGCTTT 4200 

GTTATTCATT AAATCAACGA AATCGCTGGT GTT HT1X3 AA ACCTTCTTAG CTAAAATTAA 4260

TGCCGCGGCA TTACTAGAAT TAGATACTGT AATTTGTAAT AGGTCTGCXJA TTGTCCATAC 4320 

TTGTCCAGGA TATAGTTTCG TATTACTCAA CTCAGGTAGT GTAGACATAA TATATTCTTT 4380 

GTTCGTCATT GTGACTGTGT CATCAAGTGA AAGCTGCCCC TTATTTACAG CTTCCAATGT 4440 

TAAGTACATT GTCATTAATT TAGTCATAGA CGCTGGAtTC CACTTAGTAT CGATATTGTA 4500 

TTGATACAGT AATTGTCCAG TTTGACTTAC ATTAACAGCA CTCGTCGGTT CGTATGCAGC 4560 

CGACAAACCT GCATAACCAT ATTGATTTGC TGCTTGTACA GGGGTTACGT CACTGTTAGT 4620 

AGCTTGTGCA TATGGTGTCA TAATACTTAA TGTTAAACAT AAAATGATGA TAATAGATAT 4680 

TAAATnrrC ATAAAGCGTT AATCTTCCCT TTTCCAATTC TTAAATATTC CCTAAAAQCA 4740 

ATGGTTATTC CTACTTACGG AAATCATTGC TAATTCACTT CACCTTAATT AAATTGTTGA 4800 

AAATAAAGTT TTCTGCAGTT AATTTGAAAA ATAATGCAAA TATATTACGT GTGTAGCTAA 4860 

AGGTGTTATA ATGTTTGTAC GAAGAGCAAA CTTACTCAAA AGCGATTAAT TTTCATGTTT 4920

TAATATAAAG ACTTTGAGAA GTTATTACAA AAAATGCAAT AGAAATATTC TATCATATAA 4980 
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AAGTATATGA TAGAAATGCA TGTATCTATC TAAATGAATT AACTATAAAT TTCAAACAGA 5100 

AGAGGTAAAA CTATGAAACG AGAAAATCCA TTGTTTTTCT TATTTAAAAA ACTATCATGG 5160 

CCAGTGGGTC TTATCGTTGC AGCTATCACT ATTTCATCAC TAGGGAGCTT AAGTGGACTA 5220 

TTAGTGCCAC TGTTTACTGG ACGAATTGTA GATAAATTTT CCgTGAGCCA TATCAATTGG 5280 

AATCtAATCG CATTATTTGG TGGTATCTTT GTCATCAATG CTTTATTAAG CGGATTAGGT 5340 

TTATATTTAT TAAGTAAAAT TGGTGAAAAG ATTATTTATG CGATAC3GCTC AGTTTTATGG 5400 

GAGCATATCA TACAATTAAA AATGCCATTC TTTGACAAAA ATGAAAGTGG TCAATTAATG 5460 

AGTCX3ATTAA CTGACGATAC GAAAGTGATA AATGAATTTA TTTGACAAAA GCTACCTtnAC 5520 

TTATTACCAT CAATCGTTAC ATtAGTTGGG TCACTAATCA TGTTATTTAT TTTAGATTGG 5580 

AAAATGACAT TATTAACATT TATAACGATA CCGATATTC6 TTTTaATTAT GATTCCTCTA 5640 

GGTCGTATTA TGCAAAAGAT ATCX5ACAAGT ACACAATCTG AAATTGCAAA CTTCAGTGGT 5700

TTGTTAGGGC GTGTCCTAAC TGAAATGCGT CTTGTTAAAA TATCAAATAC AGAGCGTCTT 5760 

GAATTAGATA ATGCACATAA AAATTTGAAT GAAATATATA AATTAGGTTT AAAACAGGCT 5820 

AAAATTGCGG CAGTTGTACA ACCAATTTCA GGTATAGTTA TGTTGCTAAC AATTGCAATT 5880

ATTTTAGGTT TTGGTGCATT AGAAATTGCG ACTGGTGCAA TCACTGCAGG TACATTAATT 5940

GCAATGATAT TTTATGTTAT TCAGTTATCT ATGCCTTTAA TCAATCTTTC CACGTTAGTT 6000 

ACAGATTATA AAAAGGCAGT CGGTGCAAGT AGTAGAATAT ACGAAATCAT GCAAGAACCT 6060 

ATTGAACCGA CAGAAGCTCT TGAAGATTCT GAAAATGTAT TAATTGATGA CGGTGTATTG 6120 

TCATTTGAAC ATGTAGACTT TAAATATGAT GTGAAGAAAA TATTAGATGA TGTGTCGTTC 6180 

CAAATCCCAC AAGGTCAAGT GAGTGCTTTT GTAGGCCCTT CTOGQTCTGG TAAAAGTACG 6240 

ATATTTAATC TGATAGAACG TATGTATGAA ATTGAGTCAG GTGATATTAA ATATGGCCTT 6300 

GAAAGTGTCT ATGATATCCC GTTATCTAAG TGGCGACGCA AAATTGGATA TGTTATGCAA 6360 

TCAAATTCGA TGATGAGTGG TACAATTAGA GACAATATTT TATACGGAAT TAATCGTCAT 6420 

GTTTCAGATG AAGAACTTAT TAATTATGCT AAATTAGCGA ACTGTCATGA TTTTATCATG 6480

CAATTTGATG AAGGATATGA CACGCTTGTA GGTGAACGAG GATTGAAACT GTCTGGCGGA 6540

CAACGTCAAC GTATTGATAT TGCTAGAAGT TTTGTTAAAA ATCCTGATAT TTTGTTACTT 6600 

GATGAAGCAA CAGCTAATCT CGATAGTGAA AGTGAATTGA AAATTCAAGA AGCTTTAGAA 6660 

ACATTGATGG AAGGTAQAAC AACGATTGTC ATTGCGCATC GTTTGTCTAC AATTAAAAAA 6720

GCCGGTCAAA TTATATTCTT AGACAAAGGA CAGGTAACAG GTAAAGGTAC GCATTCAGAA 6780 
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TTTTATATAT ATAAGTAAGC TTGGAGCAAA TACACATATA CCATCGAGGA AATTAAAGTG 


6900 




TGGCACATTG ATGGATATAG ATGTTAATAA ATTGCTTCAA GCTTTTGTCT ATTTTAAATC 6960

ATTTGAGAAG TTACGACATA ATAATTCTTA AATTAATGAA ATCGATATTT TAAGAAAAAA 7020

ATGCTCATGG TATAATACAA GTTATAAGCA AACATACATA TATTAAATAC TGTAGCCACG 7080

AGTCATAATT CTTCATATTT TACATAGCAA TTTAACTGAT TTTAGAGTCC ACGGTACAGA 7140

AGTTTGATAT TTCAATGTTT CTAAATTTTT AAAAAATTAA ATCATAGGTG GGTGCCAAAT 7200

GTmTATTA ATCAACATTA TTGGTCTAAT TGTATTTCTT GGTATTGCGG TATTATTTTC 7260

AAGAGATCGC AAAAATATCC AATGGCAATC AATTGGGATC TTAGTTGTTT TAAACCTGTT 7320

TTTAGCATGG TTCTTTATTT ATTTTGATTG GGGTCAAAAA GCAGTAAGAG GAGCAGCCAA 7380

TGGTATCGCT TGGGTAGTTC AGTCAGCGCA TGCTG6TACA GGTTTTGCAT TTGCAAGTTT 7440

GACAAATGTT AAAATGATGG ATATGGCTGT TGCAGCCTTA TTCCCAATAT TATTAATAGT 7500

GCCATTATTT GATATCTTAA TGTACTTTAA TATTTTACCG AAAATTATTG GAGGTATTGG 7560

TTGGTTACTA GCTAAAGTAA CAAGACAACC TAAATTCX3AG TCATTCTTTG GGATAGAAAT 7620

GATGTTCTTA GGAAATACTG AAGCATTAGC CGTATCAAGT GAGCAACTAA AACGTATGAA 7680

TGAAATGCGT GTATTAACAA TCGCAATGAT GTCAATGAGC TCTGTATCGG GAGCTATTGT 7740

AGGTGCGTAT GTACAAATGG TACCAGGAGA ACTGGTACTA ACGGCAATTC CACTAAATAT 7800

CGTTAACGCG ATTATTGTGT CATGCTTGTT GAATCCAGTA AGTGTTGAAG AGAAAGAAGA 7860

TATTATTTAC AGTCTTAAAA ACAATGAAGT TGAACGTCAA CCATTCTTCT CATTCCTTGG 7920

AGATTCTGTA TTAGCAGCAG GTAAATTAGT ATTAATCATC ATCGCATTTG TTATTAGTTT 7980

TGTAGCGTTA GCTGATCTAT TTGATCGTTT TATCAATTTG ATTACAGGAT TGATAGCAGG 8040

ATGGXTAGGC ataaaaggta gtttcggttt aaaccaaatt ttaggtgtgt TTATGTATCC 8100

ATTTGCGCTA TTACTCGGTT TACCTTAT6A TGAAGOGTGG TTGGTAGCAC AACAAATGGC 8160

TAAGAAAATT GTTACAAATG AATTTGTTGT TATGGGTGAA ATTTCTAAAG atattgcatc 8220

TTATACACCA CACCATCGTG CGGTTATTAC AACATTCTTA ATTTCATTTC CAAACTTCTC 8280

aacgattggt atgattatcg gtacattgaa aggcattgtt gataaaaaga catcagactt 8340

TGTATCTAAA TATGTACCTA TGATGCTATT ATCAGGTATC CTAGTTTCAT TAITAACAGC 8400

agctttcgtt ggtttatttg catggtaata tgtcgaagag tgactatgat aatacatttt 8460

AACTAATAAA TATGTCCAGG CATGTCGTCT ATTGATATAG GTGAGATGCT TGGACTTTTT 8520

TATTATTGAT ATAAAGGTAT nTAAATATTT TTAAAGTTAC CGAAATTGAA GCATTATAAA 8580
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GACAGTAAGG ACTAGGTACA GTCATAGTAC TTCGAGCAAA ATTTGTTTTG TTATTATAAA 8700 

CAACACAAAG GAGATAACTT CTCTAnTGAA GAAGTTAAAA ACATTATAGC AGACAATGAA 8760 

ATGAAAGTAA ATTAAAAAT 8779

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 31096 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

GTTGCAGTAG TCAAAGAATT AAACAAGGTG AAGGcGTGTA GCTTGCACAC CCGAAAATGT 60 

GCGTAAGTTA aCGGATGCAG GACATAAAGT AATTGTTGAA AAAAATGCTG GCATTGGTTC 120

AGGATTTTCT AACGATATGT ATGAAAAAGA AGGCGCTAAG ATCGTAACTC ACGAACAAGC 180 

ATGGGAAGCT GATCTTGTTA TCAAAGTAAA AGAACCTCAT GAAAGCGAAT ATCAATATTT 240 

CAAAAAGAAT CAAATTATCT GGGGATTTTT ACATCTAGCA TCTTCAAAAG AAATAGTAGA 300

AAAAATGCAA GAAGTTGGTG TAACTGCGAT TAGTGGTGAA ACCATTATAA AAAATGGAAA 360 

AGCAGAATTA TTAGCGCCAA TGAGTGCTAT AGCAGGTCAA CGCTCAGCAA TTATGGGAGC 420 

TTACTACTCT GAAGCACAAC ATGGTGGTCA AGGTACTTTA GTGACTGGTG TACATGAAAA 480 

TGTGGATATA CCTGGTAGTA CATATGTGAT TTTCGGTGGT GGAGTAGCAG CAACAAATGC 540 

AGCAAATGTT 6CCTTGGGAC TAAATGCTAA AGTAATCATT ATCGAGTTAA ACGATGACCG 600 

CATTAAATAT CTTGAAGATA TGTATGCAGA AAAAGATGTC ACAGTAGTCA AATCAACACC 660 

AGAi^TTTA GCAGAACAAA TTAAGAAAGC AGATGIATTT ATTTCTACAA TTTTAATTTC 720 

AGGTGCGAAA CCGCCAAAAT TGGTTACTCG TGAGATGGTT AAATCAATGA AAAAAGGTTC 780

AGTATTAATC GATATAGCTA TTGACCAAGG TGGAACTATT GAAACAATTA GACCAACTAC 840 

AATTTCTGAT CCAGTGTATG AAGAAGAAGG TGTGATTCAT TATGGTGTAC CAAATCAACC 900 

AGGAGCAGTC CCAAGAACTT CAACAATGGC ATTAGCACAA GGAAATATT6 ATTATATATT 960

AGAAATTTGT GACAAAGGCT TAGAACAAGC AATTAAAGAT AATGAAGCCT TAAGTACTGG 1020 

TGTAAACATT TACCAAGGAC AAGTGACAAA TCAAGGATTA 6CTTCATCAC ATGACCTAGA 1080 

TTATAAAGAA ATATTAAATG TTATCGAATA GATAGTAATT TAAATGAAAT TGAGTGAAAT 1140

GAATATTTTA AATATAGCAT TATAGTTTGG ACTAAAAATT TACAAAACGG AAGGATGTAA 1200 
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TCGAAGAAGC TAAAGCAAGC ATTAAACCAT TTATTCGTCG AACACCTCTA ATTAAATCAA 1320 

TGTATTTAAG CCAAAGTATA ACTAAAGGGA ATGTATTTCT AAAATTAGAA AATATGCAAT 1380 

TCACAGGATC TTTTAAATTT AGAGGCGCTA gCAATnAAAA TTAATCACTT AACAGATGAA 1440

CAAAAAGAAA AAGGCATTAT CGCAGCATCT GCTGGGgAAC CATGCACAAG GTGTTGCTTT 1500 

AACAGCTAAA TTATTAGGCA TTGATGCAAC GATTGTAATG CCTGAAACAG CACCACAAGC 1560 

GAAACAACAA GCAACAAAAG GCTATGGGGC AAAGGTTATT TTAAAAGGTA AAAACTTTAA 1620 

CGAAACTAGA CTTTATATGG AAGAATTAGC GAAAGAAAAT GGCATGACAA TCGTTCATCC 1680 

ATATGACGAT AAGTTTGTAA TGGCAGGCCA AGGAACAATT GGTTTAGAAA TTTTAGATGA 1740 

TATTTGGAAT GT6AATACAG TCATCGTACC AGTTGGCGGT GGAGQATTAA TTGCAGGTAT 1800 

TGCCACCGCA TTAAAATCAT TTAACCCTTC AATTCATATT ATCGGTGTTC AATCTGAGAA 1860

TGTTCATG6T ATGGCTGAGT CTTTCTATAA GAGAGATTTA ACTGAACATC GAGTGGATAG 1920 

CACAATAGCA GATGGTTGTG ATGTAAAAGT TCCTGGTGAA CAAACATATC AAGTAGTTAA 1980 

ACATTTAGTA GAT6AATTTA TTCTTGTTAC TGAAGAAGAA ATTGAACATG CTATGAAAGA 2040 

TTTAATGCAG CGTGCCAAAA TTATTACTGA AGGTGCAGGC GCATTACCAA CAGCTGCAAT 2100

TTTAAGTGGA AAAATAAACA ATAAATGGCT TGAAGATAAA AATGTTGTTG CATTAGTTTC 2160 

AGGCGGGAAT GTTGACTTAA CTAGAGTTTC AGGTGTCATT GAACATGGAC TGAATATTGC 2220 

AGATACAAGC AAGGGTGTGG TAGGTTAAAA CATTTAATCT TAAAAATGAG GTGTAATTAT 2280

GTCAAATGGT AAAGAATTAC AAAAAAATAT AGGTTTCTTC TCAGCGTTTG CTATTGTTAT 2340

GGGGACAGTT ATTGGTTCAG GAGTATTCTT TAAAATATCA AACGTAACAG AAGTAACAGG 2400 

AACAGCAGGA ATGGCCTTGT TTGTATGGTT CCTAGGCGGC ATCATTACCA TTTGTGCGGG 2460 

GTTAACAGCA GCAGAACTTG CTGCTGCAAT CCCTGAAACA GGTGGCTTAA CXy^TATAT 2520 

AGAATATACA TACGGTGATT TCTGGGGCTT CCTATCAGGT TGGGCGCAAT CATTTATTTA 2580 

TTTTCXIAGCT AACGTAGCAG CATTGTCTAT OGTATTTGCG ACACAGCTAA TTAATTTATT 2640 

CCATTTATCT ATAGGTTCGT TAATACCAAT AGCAATCGCA TCTGCGTTAT CTATTGTGTT 2700 

GATAAATTTC CTAGGTTCAA AAGCAGGCGG AATTTTACAA TCAGTTACTT TAGTAATTAA 2760

ACTGATTCCA ATCATCGTTA TTGTAATTTT TGGTATTTTT CAATCTGGAG ATATCACTTT 2820 

TTCATTAATT CCAACTACAG GTAATTCaGG AAATGGCTTC TTTACAGCAA TTGGTAGTGG 2880 

TTTATTAGCA ACTATGTTTG CATATGATGG TTGGATTCAT GTAGGAAATG TTGCGGGGGA 2940

ACTTAAAAAT CCTAAACGCG ATTTACCTTT AGCGATTTCA GTTGGTATCG GTTGTATTAT 3000 
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TGGTAATTTA AATGCAGCTT CAGATACATC AAAAATATTA TTTGGTGAAA ATGGCGGTAA 3120 

GATTATTACA ATCGGTATAT TAATTTCTGT TTATGGTACG ATCAATGGCT ATACTATGAC 3180 

TGGTATGCXIC GTACCATATG CAATGGCTGA AAGAAAATTA TTGCCATTTA GCCATTTATT 3240

CGCAAAATTA ACAAAATCTG GCGCACCATG GTTTGGCGCA ATTATACAAC TTATAATCGC 3300 

TATCATCATG ATGTCAATGG GAGCATTTGA TACAATTACA AATATGTTAA TCTTTGTTAT 3360 

TTGGTTGTTC TATTGTATGT CATTTGTTGC GGTAATAATT TTAAGAAAAC GTGAACCAAA 3420 

TATGGAACGA CCATATAAAG TACCGTTATA TCCGATCATA CCTTTAATTG CTATTTTGGC 3480 

AGGATCATTT GTATTAATTA ATACACTGTT TACACAATTT ATATTAGCAA TCATTGGAAT 3540 

TCTAAXAACA GCACTTGGTA TACCAGTTTA TTACTATAAA AAGAAACAAA AAGCAGCATA 3600 

AGGTAAGATA ACTAGCATTG AGAATAAATG GATGGACTAC TAATAAATTT AAAGTTTTAC 3660 

ACATTAAAAT CAAAAACCAT TCAATTATTC TATGGAACAG ACAAATTTCT GTTATGGAAT 3720

TTGTCTGTTT TTCAAAAGTA TAGGGAGGCA AATAGAGATG GAAAAGCCGT CAAGAGAGGC 3780 

ATTTGAAGGC AATAATAAGT TGTTAATAGG AATTGTTCTA AGTGTAATAA CGTTTTGGCT 3840 

ATTTGCACAA TCATTGGTTA ATGTTGTACC AATACTTGAA GATAGTTTCA ATACAGATAT 3900

TGGAACGGTT AATATCGCCG TTAGTATAAC TGCTTTATTT TCAGGAATGT TTGTAGTAGG 3960 

AGCAGGTGGT CTTGCTGATA AATATGGCAG AATTAAACTC ACGAACATTG GTATTATCTT 4020 

AAATATATTA GGTTCATTAT TAATCATTAT TTCAAATATT CCTTTATTAC TTATTATAGG 4080

AAGATTAATT CAAGGACTTT CAGCAGCATG TATTATGCCT GCAACTTTGT CTATTATTAA 4140

GTCATATTAC ATTGGGAAAG ATAGACAACG CGCTTTAAGT TATTGGTCAA TTGGCTCATG 4200 

GGGCGGCTCT GGTGTTTGTT CATTTTTTGG AGGTGCAGTT GCAACGCTTT TAGGTTGGCX3 4260 

TTGGATTTTC ATCCTATCAA TTATAATTTC ATTAATTGCA CTGTTTCTTA TTAAAGGCAC 4320 

ACCTGAAACT AAATCTAAAT CGATTTCTCT AAATAAATTT GACATTAAAG GTCTGGTTCT 4380 

TTTAGTCATT ATGCTCCTCA GTTTAAATAT TTTAATTACT AAAGGATCAG AATTAGGTGT 4440 

AACCTCACTT CTTTTTATTA CTTTATTAGC TATTGCAATT GGATCTTTTA GTTTATTTAT 4500 

AGTTCTTGAA AAGCGTGCTA CAAATCCTTT AATCGATTTT AAATTATTTA AAAATAAAGC 4560

TTACACAGGT GCAACAGCTT CAAACTTTTT GTTAAATGGT GTTGCAGGAA CATTAATAGT 4620 

AGCCAACACA TTTGTTCAAA GAGGTTTAGG ATATTCTTCA TTGCAAGCAG GAAGTTTATC 4680 

AATCACTTAT TTAGTAATGG TACTAATTAT GATTCGTGTT GGTGAAAAGT TACTTCAAAC 4740

ACTCGGATGC AAGAAACCAA TGTTAATTGG AACAGGAGTT CTTATTGTCX3 GAGAATGTCT 4800 
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ATTCTTTGGT TTAGGACTAG GGATATATGC 


TACACCATCA 


ACAGATACAG 


CAATTGCAAA 


4920 




TGCACCGTTA GAAAAAGTAG GCGTTGCTGC AGGTATCTAT AAAATGGCTT CTGCATTAGG 4980

TGGAGCATTT GGCGTCGCAT TGAGTGGTGC AGTATATGCA ATCGTATCAA ATATGaCAAA 5040

CATTTATACA GGTGcAATGa TTGnCATTAT GGTTaAATGC AGGTATGGGa ATATTATCaT 5100

TCGTTATCAT TTTGtTACTT GTGcCTAAAC mAAACGACAC TCAATTATGA TAATTGAGAA 5160

TTAAATTGAA ATCATACAAG TCGCTACAAT ATTAAACAAA AATATAAACC GATTCTTATG 5220

TGTCATTATT TTAAATGAAC ATAGGGATTG GTTTTTTATT ACTCTTTTAC GCTACTTTAT 5280

TTATAATTAT TATAAATTGT CACAAATTCA ATTTACCTTA CAATATATTT TGTGTTATTA 5340

TATTCTGGAG CATAAATAAA TTGTTCAACA CATAGTTGTA ATGTGTTTCA ATACTTTTTG 5400

GATAGATTGC GAAATTGTAT TGAATCGTCA TCGTTTTAAA TTTTTAAATG AGAATGGAAT 5460

GAGCATTACA ATACACAAGC AATCAAAAGT AAATACATTC ACAACACAAC AGAGACATAA 5520

CAACAAGATA AGGAGTGAAC AATAGCT6TG AATTATCGTG ATAAAATTCA AAAGTTTAGT 5580

ATTCGTAAAT ATACAGTTGG TACATTTTCA ACTGTCATTG CGACATTGGT ATTTTTAGGA 5640

TTCAATACAT CACAAGCACA TGCTGCTGAA ACAAATCAAC CAGCAAGCGT GGTTAAACAG 5700

AAACAACAAA GTAATAATGA ACAGACTGAG AATCGAGAAT CTCAAGTACA AAATTCTCAA 5760

AATTCACAAA ATGGTCAATC ATTATCTGCT ACTCATGAAA ATGAGCAACC AAATATTAGT 5820

CAAGCTAATT TAGTAGATCA AAAAGTAGCG CAATCATCTA CTACTAATGA TGAACAACCA 5880

GCATCTCAAA ATGTAAATAC AAAGAAAGAT TCGGCAACGG CTGCGACAAC ACAACCAGAT 5940

AAAGAACAAA GTAAGCATAA ACAAAACGAA AGTCAATCTG CTAATAAAAA TGGAAACGAC 6000

AATAGAGCGG CTCATGTAGA AAATCATQAA GCAAATGTAG TAACAGCTTC AGATTCATCT 6060

GATJ^TGGTA ACGTACAACA TGACCGAAAT GAATTACAAG CGTTTTTTGA TGCAAATTAT 6120

CATGATTATC GCTTTATTGA CCGTGAAAAT GCAGATTCTG GCACATTTAA CTATGTAAAA 6180

GGCATTTTTG ATAAGATTAA TACGTTATTA GGCAGTAATG ATCCAATAAA CAATAAAGAC 6240

TTGCAACTTG CATACAAAGA ATTGGAACAA GCTGTTGCTT TAATTCGTAC AATGCCTCAA 6300

CGTCAACAGA CTAGCCGACG TTCAAATAGA ATTCAAACX^C GTTCGGTTGA GTCAAGAGCT 6360

GCAGAGCCTA GATCAGTATC AGACTATCAA AATGCAAATT CATCATATTA TGTTGAAAAT 6420

GCTAATGATG GTTCGGGCTA TCCTGTTGGT ACATATATCa ATGCTTCTAG TAAAGGGGCG 6480

CCATATAATT TACCAACTAC ACCATGGAAT ACATTGAAGG CCTCTGACTC AAAGGAAATT 6540

GCTCTTATGA CAGCGAAACA AACTGGAGAC GGGTACCAAT GGGTTATTAA GTTTAATAAA 6600
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GTAGGAAGAA CTGACTTTGT AACAGTTAAT TCAGATGGAA CAAATGTACA ATGGAGTCAT 6720 

GGAGCAGGAG CAGGTGCAAA TAAACCACTT CAACAAATGT GGGAATATGG AGTAAATGAT 6780 

CCTCATCGTT CACATGACTT TAAAATAAGA AATAGAAGTG GCCAAGTAAT ATATGACTGG 6840

CCAACTGTCC ATATTTATTC TTTAGAAGAT TTATCTAGAG CGAGTGATTA TTTTAGTGAA 6900 

GCTGGAGCGA CACCTGCTAC TAAAGCTTTT GGTAGACAAA ATTTTGAATA TATTAATGGT 6960 

CAAAAACCTG CTGAATCACC GGGTGTTCCT AAAGTTTATA CTTTCATCGG TCAAGGTGAT 7020 

GCAAGTTATA CAATTTCATT TAAAACACAA GGTCCAACTG TTAATAAATT GTACTATGCA 7080 

GCAGGTGGGC GTGCTTTAGA GTACAATCAA TTATTTATGT ACAGTCAACT ATACGTCGAA 7140 

TCAACGCAAG ACCATCAACA ACGTCTTAAT GGTTTAAGAC AAGTGGTTAA TCGTACATAT 7200 

CGCATAGGTA CAACTAAACG TGTAGAAGTG AGTCAAGGAA ATGTACAAAC GAAAAAGGTA 7260 

TTAGAAAGTA CAAACCTAAA TATAGATGAT TTTGTTGATG ATCCTTTAAG TTATGTTAAG 7320 

ACGCCGAGTA ATAAAGTGTT AGGATTTTAT TCGAATAATG CAAATACTAA TGCTTTTAGA 7380 

CCGGGTGGAG CCCAACAATT AAATGAATAT CAATTAAGTC AATTATTTAC TGATCAAAAA 7440 

TTACAAGAAG CAGCAAGAAC TAGAAACCCA ATAAGATTAA TGATTGGTTT CGACTATCCT 7500

GATGCTTATG GTAATAGTGA AcTTTAGTTC CTGTTAACTT AACGGTATTA CCTGAAATCC 7560 

AACATAATAt TaAATTCTTT AAAAATGACG ATACTCAAAA TATTGCTGAA AAACCATTTT 7620 

CAAAACAAGC TGGGCATCCA GTTTTCTATG TATATGCAGG TAACCAAGGG AATGCTTCCG 7680

TGAATTTAGG TGGTAGCGTA ACATCTATTC AACCATTACG TATTAATTTA ACAAGTAATG 7740 

AGAATTTTAC AGATAAAGAT TGGCAAATTA CAGGTATTCC GCGTACATTA CACATTGAAA 7800 

ACTCGACAAA TAGACCTAAT AATGCCAGAG AACGCAATAT TGAACTTGTT GGTAACTTAT 7860 

TACC5GGGGA TTACTTTGGA ACGATACGTT TTGGACGTAA AGAACAATTA TTCGAAATTC 7920 

GTGTTAAACC ACATACACCA ACAATTACAA CGACAGCTGA GCAATTAAGA GGTACAGCAT 7980 

TACAAAAAGT GCCTGTTAAT ATTTCGGGAA TACCGTTGGA TCCATCGGCA TTGGTTTATT 8040 

TAGTTGCACC AACAAATCAA ACTACGAATG GTGGTAGTGA GGCAGATCAA ATACCATCTG 8100 

GTTATACGAT ACTTGCGACT GGTACACCTG ATGGGGTGCA TAATACAATT ACTATACGAC 8160 

CGCAAGATTA TGTTGTATTC ATACCACCTG TAGGTAAACA AATTAGAGCA GTAGTTTATT 8220 

ATAATAAAGT AGTTGCATCT AATATGAGTA ATGCTGTTAC TATTTTGCCA GATGACATTC 8280 

CACCAACAAT CAATAATCCT GTTGGAATAA ATGCCAAATA CTATCGAGGC GACGAAJcCAA 8340

CTTTACAATG GGTGTCTCTG ATAGACATTC TGGTATAAAA AATACAACTA TTACGACATT 8400 
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TACAGGTAGA GTGAGTATGA ATCAGGCATT 


TAACAGTGAT 


ATTACATTTA 


AAGTGTCAGC 


8520 




GACAGaCAAT GTCAATAATA CGACAAATGA TAGTCAATCT AAACATGTTT CAATTCATGT 8580

AGGTAAAATT AGTGAAGATG CTCATCCGAT TGTATTAGGA AATACTGAGA AAGTTGTAGT 8640

AGTCAATCCG ACTGCTGTAT CTAATGATGA AAAGCAAAGC ATAATTACTG CCTTTATGAA 8700

TAAAAACCAA AATATAAGAG GATATTTAGC ATCAACTGAT CCAGTAACTG TCGATAATAA 8760

TGGTAATGTC ACATTACATT ACCGTGATGG CTCATCGACA ACGCTTGATG CTACAAATGT 8820

GATGACATAC GAACCAGTTG TGAAACCTGA ATACCAAACT GTCAATGCTG CTAAAACAGC 8880

AACGGTAACG ATTGCTAAAG GACAATCATT TAGTATTGGT GATATTAAAC AATATTTTAC 8940

TTTAAGTAAT GGACAACCTA TTCCAAGTGG CACATTTACA AATATTACAT CTGATAGAAC 9000

TATTCCAACT GCACAAGAAG TTAGTCAAAT GAACGCAGGC ACGCAGTTAT ACCATATAAC 9060

T6CTACAAAT GCGTATCATA AAGATAGTGA AGACTTCTAT ATTAGTTTGA AAATCATCGA 9120

TGTGAAACAA CCAGAAGGCG ATCAACGTGT ATATCGTACA TCAACATATG ATTTAACTAC 9180

TGATGAAATC TCAAAAGTAA AACAAGCATT TATTAATGCA AATAGAGATG TAATTACGCT 9240

TGCCGAAGGT GATATTTCAG TTACAAATAC ACCTAATGGT GCTAATGTAA GTACTATTAC 9300

AGTAAATATT AATAAAGGTC GATTAACGAA ATCATTCGCG TCAAACCTAG CTAATATGAA 9360

TTTCTTGCGT TGGGTTAATT TCCCACAAGA TTATACAGTG ACATGGACGA ATGCAAAAAT 9420

TGCAAACAGA CCAACAGATG GTGGTTTATC ATGGTCTGAT GACCATAAAT CTTTAATTTA 9480

TCGTTATGAT GCTACATTAG GTACTCAAAT TACGACGAAT GATATTTTAA CAATGTTAAA 9540

AGCAACAACT ACAGTGCCTG GATTGCGAAA TAACATTACT GGTAATGAAA AATCACAAGC 9600

AGAAGCTGGC GGAAGACCTA ACTTTAGAAC GACTGGTTAT TCACAATCAA ATGCGACAAC 9660

TGATCGTCAA CGTCAATTTA CGTTGAATGG TCAAGTGATT CAAGTGTTAG ACATCATCAA 9720

CCCTTCAAAC GGTTATGGTG GGCAACCTGT TACAAATTCA AATACTCX3TG CAAACCATAG 9780

TAACTCAACT GTTGTTAACG TAAACGAACC GGCAGCTAAT GGTGcTGGCG CATTTACAAT 9840

TGACCACGTT GTAAAAAGTA ATTCTACACA TAATGCAAGT GATGCAGTTT ATAAAGCACA 9900

GTTATACTTA ACGCCATATG GTCCAAAACA ATATGTTGAA CATTTAAATC AAAATACAGG 9960

AAATACTACT GACGCTATTA ACATTTATTT TGTACCAAGT GACTTAGTGA ATCCAACAAT 10020

TTCAGTAGGT AATTACACTA ATCATCAAGT GTTCTCAGGT GAAACATTTA CAAATACTAT 10080

TACAGCGAAT GATAACTTTG GTGTGCAATC TGTAACTGTA CCAAATACAT CACAAATTAC 10140

AGGTACTGTT GATAATAACC ATCAACATGT TTCTGCAACG GCACCAAATG TGACATCAGC 10200
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GTTCAATGTA ACAGTGAAAC CTTTGCGTGA TAAATATCGA GTTGGTACTT CATCAACGGC 10320 

TGCTAATCCT GTGAGAATTG CCAATATTTC C3AATAATGCG ACAGTATCAC AAGCTGATCA 10380

AACGACAATT ATTAATTCGT TAACGTTTAC TGAAACAGTA CCAAATAGAA GTTATGCAAG 10440

AGCAAGTGCG AATGAAATCA CTAGTAAAAC AGTTAGTAAT GTCAGTCGTA CTGGAAATAA 10500 

TGCCAATGTg CACAGTAACT GTTACTTATC AAGATGGAAC AACATCAACA GTGACTGTAC 10560 

CTGTAAAGCA TGTCATTCCA GAAATCGTTG CACATTCGCA TTACACTGTA CAAGGCCAAG 10620 

ACTTCCCAGC AGGTAATGGT TCTAGTGCAT CAGATTACTT TAAGTTATCT AATGGTAGTG 10680 

ACATTGCAGA TGCAACTATT ACATGGGTAA GTGGACAAGC GCCAAATAAA GATAATACAC 10740 

6TATTGGTGA AGATATAACT GTAACTGCAC ATATCTTAAT TGATGGCGAA ACAACGCCX3A 10800

TTACGAAAAC AGCAACATAT AAAGTAGTAA GAACTGTACC GAAACATGTC TTTCAAACAG 10860 

CCAGAGGTGT TTTATACCCA GGTGTTTCAG ATATGTATGA TGCGAAACAA TAT6TTAAGC 10920 

CAGTAAATAA TTCTTGGTCG ACAAATGCGC AACATATGAA TTTCCAATTT GTTGGAACAT 10980 

ATGGTCCTAA CAAAGATGTT GTAGGCATAT CTACTCGTCT TATTAGAGTG ACATATGATA 11040 

ATAGACAAAC AGAAGATTTA ACTATTTTAT CTAAAGTTAA ACCTGACCCA CCTA6AATTG 11100

ACGCAAACTC TGTGACATAT AAAGCAGGTC TTACAAACCA AGAAATTAAA GTTAATAACG 11160 

TATTAAATAA CTCGTCAGTA AAATTArTTA AAGCAGATAA TACACCATTA AATGTCACAA 11220 

ATATTACTCA TGGTAGCGGT TTTAGTTCGG TTGTGACAGT AAGTGACX3CG TTACCAAATG 11280

GCGGAATTAA AGCAAAATCT TCAATTTCAA TGAACAATGT GACGTATACG ACGCAAGACG 11340 

AACATGGTCA AGTTGTTACA GTAACAAGAA ATGAATCTGT TGATTCAAAT GACAGTGCAa 11400 

CAGTAACAGT GACACCACAA TTACAAGCAA CTACTGAAGG CGCTGTATTT ATTAAAGGTG 11460

GCGASSGTTT TGATTTCGGA CACGTAGAAA GATTTATTCA AAACCCGCCA CATGGGGCAA 11520 

CGGTTGCATG GCATGATAGT CCAGATACAT GGAAGAATAC AGTCGGTAAC ACTCATAAAA 11580 

CTGCGGTTGT AACATTACCT AATGGTCAAG GTACGCGTAA TGTTGAAGTT CCAGTCAAAG 11640 

TTTATCCAGT TGCTAATGCA AAGGCGCCAT CACGTGATGT GAAAGGTCAA AATTTGACTA 11700 

ATGGAACXK5A TGCGATGAAC TACATTACAT TTGATCCAAA TACAAACACA AATGGTATCA 11760 

CTGCAGCATG GGCAAATAGA CAACAACCAA ATAACCAACA AGCAGGCGTG CAACATTTAA 11820 

ATGTCGATGT CACATATCCA GGTATTTCAG CTGCTAAACG AGTTCCTGTT ACTGTTAATG 11880 

TATATCAATT TGAATTCCCT CAAACTACTT ATACGACAAC GGTTGGAGGC ACTTTAGCAA 11940

GTGGTACGCA AGCATCAGGA TATGCACATA TGCAAAATGC TACTGGTTTA CCAACAGATG 12000 
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TGAATAAACC GAATGTGGCT AAAGTCGTTA ACGCAAAATA TGACGTCATC TATAACGGAC 12120 

ATACTTTTGC AACATCTTTA CCAGCGAAAT TTGTAGTAAA AGATGTGCAA CCAGCGAAAC 12180 

CAACTGTGAC TGAAACAGCG GCAGGAGCGA TTACAATTGC ACCTGGAGCA AACCAAACAG 12240 

TGAATACACA TGCCGGTAAC GTAACGACAT ACGCTGATAA ATTAGTTATT AAACGTAATG 12300

GTAACGTTGT GACGACATTT ACACGTCGCA ATAATACGAG TCCATGGGTG AAAGAAGCAT 12360 

CTGCAGCAAC TGTAGCAGGT ATTGCTGGAA CTAATAATGG TATTACTGTT GCAGCAGGTA 12420 

CTTTCAACCC TGCTGATACA ATTCAAGTTG TTGCAACGCA AGGAAGCGGA GAGACAGTGA 12480 

GTGATGAGCA AC6TAGTGAT GATTTCACAG TTGTCTGCACC ACAACCGAAC CAAGCGACTA 12540 

CTAAGATTTG GCAAAATGGT CATATTGATA TCACGCCTAA TAATCCATCA GGACATTTAA 12600 

TTAATCCAAC TCAAGCAATG GATATTGCTT ACACTGAAAA AGTGGGTAAT GGTGCAGAAC 12660 

ATAGTAA6AC AATTAATGTT GTTCGTGGTC AAAATAATCA ATGGACAATT GCGAATAAGC 12720 

CTGACTATGT AACGTTAGAT GCACAAACTG GTAAAGTGAC GTTCAATGCC AATACTATAA 12780 

AACCAAATTC ATCAATCACA ATTACTCCGA AAGCAGGTAC AGGTCACTCA GTAAGTAGTA 12840 

ATCCftAGTAC ATTAACTGCA CCGGCAGCTC ATACTGTCAA CACAACTGAA ATTGTGAAAG 12900 

ATTATGGTTC AAATGTAACA GCAGCTGAAA TTAACAATGC AGTTCaAGTT GCTAATAAAC 12960

GTACTGCAAC GATTAAAAAT GGCACAGCAA TGCCTACTAA TTTAGCTGGT GGTAGCACAA 13020 

CGACGATTCC TGTGACAGTA ACTTACAATG ATGGTAGTAC TGAAGAAGTA CAAGAGTCCA 13080 

TTTTCACAAA AGCGGATAAA CGTGAGTTAA TCACAGCTAA AAATCATTTA GATGATCCAG 13140 

TAAGCACTGA AGGTAAAAAG CCAGGTACAA TTACGCAGTA CAATAATGCA ATGCATAATG 13200 

CGCAACAACA AATCAATACT GCGAAAACAG AAGCACAACA AGTGATTAAT AATGAGCGTG 13260 

CAACACCACA ACAAGTTTCT GACGCACTAA CTAAAGTTCG TGCAGCACAA ACTAAGATTG 13320 

ATCAAGCTAA AGCATTACTT CAAAATAAAG AAGATAATAG CCAATTAGTA ACGTCTAAAA 13380 

ATAACTTACA AAGTTCTGTG AACCAAGTAC CATCAACTGC TGGTATGACG CAACAAAGTA 13440 

TTGATAACTA TAATGCGAAG AAGCGTGAAG CAGAAACTGA AATAACTGCA GCTCAACGTG 13500 

TTATTGACAA TGGCGATGCA ACTGCACAAC AAATTTCAGA TGAAAAACAT C6TGTCGATA 13560 

ACGCATTAAC AGCATTAAAC CAAGCGAAAC ATGATTTAAC TGCAGATACA CATGCCTTAG 13620 

AGCAAGCAGT GCAACAATTG AATCGCACAG GTACAACGAC TGGTAAGAAG CCGGCAAGTA 13680 

TTACTGCTTA CAATAATTCG ATTCGTGCAC TTCAAAGTGA CTTAACAAGT GCTAAAAATA 13740 

GCGCTAATGC TATTATTCAA AAGCCAATAA GAACAGTACA AGAAGTGCAA TCTGCGTTAA 13800 
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CTCATAATAG TGCTTTAAAA ACTGCTAAGA CGAAACTTGA TGAAGAAATC AATAAATCAG 13920 

TAACTACTGA TGGTATGACA CAATCATCAA TCCAAGCATA TGAAAATGCT AAACGTGCX3G 13980 

GTCAAACAGA ATCAACAAAT GCACAAAATG TTATTAACAA TGGTGATGCG ACTGACCAAC 14040 

AAATTGCCGC AGAAAAAACA AAAGTAGAAG AAAAATATAA TAGCTTAAAA CAAGCAATTG 14100 

CTGGATTAAC TCCAGACTTG GCACCATTAC AAACTGCAAA AACTCAGTTG CAAAATGATA 14160 

TTGATCAGCC AACGAGTACG ACTGGTATGA CAAGCGCATC TATTGCAGCA TTTAATGAAA 14220 

AACTTTCAGC AGCTAGAACT AAAATTCAAG AAATTGATCG TGTATTAGCC TCACATCCAG 14280 

ATGTTGCGAC AATACGTCAA AACGTGACAG CAGCGAATGC CGCTAAATCA GCACTTGATC 14340 

AAGCACGTAA TGGCTTAACA GTCGATAAAG CGCCTTTAGA AAATGCGAAA AATCAACTAC 14400 

AACATAGTAT TGACACGCAA ACAAGTACAA CTGGTATGAC ACAAGACTCT ATAAATGCAT 14460 

ACAATGCGAA GTTAACAGCT GCACGTAATA AGATTCAACA AATCAATCAA GTATTAGCAG 14520 

GTTCACCGAC TGTAGAACAA ATTAATACAA ATACGTCTAC AGCAAATCAA GCTAAATCTG 14580 

ATTTAGATCA TGCACGTCAA GCTTTAACAC CAGATAAAGC GCCGCTTCAA ACTGCGAAAA 14640 

CGCAATTAGA ACAAAGCATT AATCAACCAA CGGATACAAC AGGTATGACG ACCGCTTCGT 14700 

TAAATGCGTA CAACCAAAAA TTACAAGCAG CGCGTCAAAA GTTAACTGAA ATTAATCAAG 14760 

TGTTGAATGG CAACCCAACT GTCCAAAATA TCAATGATAA AGTGACAGAG GCAAACCAAG 14820 

CTAAGGATCA ATTAAATACA GCACGTCAAG GTTTAACATT AGATAGACAG CCAGCGTTAA 14880 

CAACATTACA TGGTGCATCT AACTTAAACC AAGCACAACA AAATAATTTC ACGCAACAAA 14940 

TTAATGCTGC TCAAAATcAT GctGCGCTTG AAACAATTAA GTCTAACATT ACGGCTTTAA 15000 

ATACTGCGAT GACGAAATTA AAAGACAGTG TTGCGGATAA TAATACAATT AAATCAGATC 15060 

AAAATTACAC TGACGCAACA CCAGCTAATA AACAAGCGTA TGATAATGCA GTTAATGCGG 15120 

CTAAAGGTGT CATTGGAGAA ACGACTAATC CAACGATGGA TGTTAACACA GTGAACCAAA 15180 

AAGCAGCATC TGTTAAATCG ACGAAAGATG CTTTAGATGG TCAACAAAAC TTACAACGTG 15240 

CGAAAACAGA AGCAACAAAT GCGATTACGC ATGCAAGTGA TTTAAACCAA GCACAAAAGA 15300 

ATGCATTAAC ACAACAAGTG AATAGTGcAC AAAACGTGCA AGCAGTAAAT GATATTAAAC 15360 

AAACGACTCA AAGCTTAAAT ACTGCTATGA CAGGTTTAAA ACX3TGGCGTT GCTAATCATA 15420 

ACCAAGTCGT ACAAAGTGAT AATTATGTCA ACGCAGATAC TAATAAGAAA AATGATTACA 15480 

ACAATGCATA CAACCATGCG AATGACATTA TTAATGGTAA TGCACAACAT CCAGTTATAA 15540 

CACCAAGTGA TGTTAACAAT GCTTTATCAA ATGTCACAAG TAAAGAACAT GCATTGAATG 15600 
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ATTTAAATAA TGCACAACGT CAAAACTTAC AATCGCAAAT TAATGGTGCG CATCAAATTG 15720 

ATGCAGTTAA TACAATTAAG CAAAATGCAA CAAACTTGAA TAGTGCAATG GGTAACTTAA 15780 

GACAAGCTGT TGCAGATAAA GATCAAGTGA AACGTACAGA AGATTATGCG GATGCAGATA 15840

CAGCTAAACA AAATGCATAT AACAGTGCAG TTTCAAGTGC CGAAACAATC ATTAATCAAA 15900 

CAACAAATCC AACGATGTCT GTTGATGATG TTAATCGTGC AACTTCAGCT GTTACTTCTA 15960 

ATAAAAATGC ATTAAATGGT TATGAAAAAT TAGCACAATC TAAAACAGAT GCTGCAAGAG 16020 

CAATTGATGC ATTACCACAT TTAAATAATG CACAAAAAGC AGATGTTAAA TCTAAAATTA 16080 

ATGCTGCATC AAATATTGCT GGCGTAAATA CTGTTAAACA ACAAGGTACA GATTTAAATA 16140 

CA)cCGATGGg TAACTTGCAA GGTGCAATCA ATGATGAACA AACGACGCTT AATAGTCAAA 16200 

ACTATCAAGA TGCGACACCT AGTAAGAAAA CAGCATACAC AAATGCGGTA CAAGCTGOGA 16260 

AAGATATTTT AAATAAATCA AATGGTCJ^AA ATAAAACGAA AGATCAAGTT ACTGAAGCX3A 16320

TGAATCAAGT GAATTCTGCT AAAAATAACT TAGATGGTAC GCGTTTATTA GATCAAGCGA 16380 

nCAAaCAGCA AAACAGCAGT TAAATAATAT GAC6CATTTA ACAACTGCAC AAAAAACGAA 16440 

TTTAACAAAC CAAATTAATA GTGGTACTAC TGTCGCTGGT GTTCAAACGG TTCAATCAAA 16500

TGCCAATACA TTAGATCAAG CCATGAATAC GTTAAGACAA AGTATTGCCA ACAAAGATGC 16560 

GACTAAAGCA AGTGAAGATT ACGTAGATGC TAATAATGAT AAGCAAACAG CATATAACAA 16620 

CGCAGTAGCT GCTGCTGAAA CGATTATTAA TGCTAATAGT AATCCAGAAA TGAATCCAAG 16680 

TACGATTACA CAAAAAGCAG AGCAAGTGAA TAGTTCTAAA ACGGCACTTA ACGGTGATGA 16740 

AAACTTAGCT GCTGCAAAAC AAAATGCGAA AACGTACTTA AACACATTGA CAAGTATTAC 16800 

AGATGCTCAA AAGAACAATT TGATTAGTCA AATTACTAGT GCGACAAGAG TGAGTGGTGT 16860 

TGAXACTGTA AAACAAAATG CGCAACATCT AGACCAAGCT ATGGCTAGCT TACAGAATGG 16920 

TATTAACAAC GAATCTCAAG TGAAATCATC TGAGAAATAT CGTGATGCTG ATACAAATAA 16980 

ACAACAAGAG TATGATAATG CTATTACTGC AGCGAAAGCG ATTTTAAATA AATCGACAGG 17040 

TCCAAACACT GCGCAAAATG CAGTTGAAGC AGCATTACAA CGTGTTAATA ATGCGAAAGA 17100 

TGCATTGAAT GGTGATGCAA AATTAATTGC AGCTCAAAAC GCAGCGAAAC AACATTTAGG 17160

TACTTTAACG CATATCACTA CAGCTCAACG TAATGATTTA ACAAATCAAA TTTCACAAGC 17220 

TACAAACTTA GCTGGTGTTG AATCTGTTAA ACAAAATGCG AATAGTTTAG ATGGTGCTAT 17280 

GGGTAACTTA CAAACGGCTA TCAACGATAA GTCAGGAACA TTAGCGAGCC AAAACTTCTT 17340

GGATGCTGAT GAGCAAAAAC GTAATGCATA CAATCAAGCT GTATCAGCAG CCGAAACCAT 17400 
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TGTTAATAAT GCGAAACATG CATTAAATGG TACGCAAAAC TTAAACAATG CGAAACAAGC 17520 

AGCGATTACA GCAATCAATG GCGCATCTGA TTTAAATCAA AAACAAAAAG ATGCATTAAA 17580 

AGCACAAGCT AATGGTGCTC AACGCGTATC TAATGCACAA GATGTACAGC ACAATGC6AC 17640 

TGAACTGAAC ACGGCAATGG GCACATTAAA ACATGCCATC GCAGATAAGA CGAATACGTT 17700 

AGCAAGCAGT AAATATGTTA ATGCCGATAG CACTAAACAA AATGCTTACA CAACTAAAGT 17760 

TACCAATGCT GAACATATTA TTAGCGGTAC GCCAACGGTT GTTACGACAC CTTCAGAAGT 17820

AACAGCTGCA GCTAATCAAG TAAACAGCX3C GAAACAAGAA TTAAATGGTG ACX3AAAGATT 17880 

ACGTGAAGCA AAACAAAACG CCAATACTGC TATTGATGCA TTAACACAAT TAAATACACC 17940 

TCAAAAAGCT AAATTAAAAG AACAAGTGGG ACAAGCCAAT AGATTAGAAG ACX3TACAAAC 18000 

TGTTCAAACA AATGGACAAG CATTGAACAA TGCAATGAAA GGCTTAAGAG ATAGTATTGC 18060

TAACGAAACA ACAGTCAAAA CAAGTCAAAA CTATACAGAC GCAAGTCCGA ATAACCAATC 18120 

AACATATAAT AGCGCTGTGT CAAATGCGAA AGGTATCATT AATCAAACTA ACAATCCGAC 18180 

TATGGATACT AGTGCGATTA CCCAAGCTAC AACACAAGTG AATAATGCTA AAAATGGTTT 18240 

AAACXXSTGCr GAAAACTTAA GAAATGCACA AAACACTGCT AAGCAAAACT TAAATACATT 18300

ATCACACTTA ACAAATAACC AAAAATCTGC CATCTCATCA CAAATTGATC GTGCAGGTCA 18360 

TGTGAGTGAG GTAACTGCTA CTAAAAATGC AGCAACTGAG TTGAATACGC AAATGGGTAA 18420 

CTTGGAACAA GCTATCCATG ATCAAAACAC AGTTAAACAA AGTGTTAAAT TTACTGATGC 18480 

AGATAAAGCT AAACGTGATG CGTATACAAA TGCGGTAAGC AGAGCTGAAG CAATTCTGAA 18540 

TAAAACGCAA GGTGCAAATA CGTCTAAACA AGATGTTGAA GCGGCTATTC AAAATGTTTC 18600 

AAGTGCTAAA AATGCATTGA ATGGTGATCA AAACGTTACA AATGCGAAGA ATGCAGCTAA 18660 

AAA15CATTA AATAACTTAA CGTCAATTAA TAATGCACAA AAACGTGACT TAACAACTAA 18720 

AATTGATCAA GCAACAACTG TAGCTGGTGT TGAAGCTGTA TCTAATACGA GTACACAATT 18780 

GAAtACAGCG ATGGCTAACT T6CAAAATGG TATTAATGAT AAAACAAATA CACTAGCAAG 18840 

TGAAAACTAT CATGATGCTG ATTCAGATAA GAAAACTGCT TATACTCAAG CCGTTACGAA 18900 

CX5CAGAAAAT ATTTTAAATA AAAATAGTGG ATCAAATTTA GACAAAACTG CCGTTGAAAA 18960 

CGCGTTGTCA CAAGTTGCTA ATGCGAAAGG TGCCCTAAAT GGTAACCATA ATTTAGAGCA 19020 

AGCTAAATCA AATGCAAACA CTACTATAAA CGGACTTCAA CATTTAACAA CTGCTCAAAA 19080 

AGATAAATTG AAACAACAAG TGCAACAAGC ACAAAATGTT GCAGGTGTAG ATACTGTTAA 19140

ATCAAGTGCC AACACATTAA ATGGTGCTAT GGGTACGTTA AGAAATAGCA TACAAGATAA 19200 
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TAACAATGCT GTTGATAGTG CTAATGGTGT CATTAATGCA ACAAGCAATC CAAATATGGA 19320 

TGCTAATGCA ATTAACCAAA TCGCTACACA AGTGACATCA ACGAAAAATG CATTAGATGG 19380 

TACACATAAT TTAACGCAAG CGAAACAAAC AGCAACAAAT GCCATCGATG GTGCTACTAA 19440

CTTAAATAAA GCGCAAAAAG ATGCGTTAAA AGCACAAGTT ACAAGTGCGC AACGTGTTGC 19500 

AAATGTAACA AGTATCCAAC AAACTGCAAA TGAACTTAAT ACAGCTATGG GTCAATTACA 19560 

ACATGGTATT GATGATGAAA ATGCAACAAA ACAAACTCAA AAATATCGTG ACGcTGAACA 19620 

AAGTAAGAAA ACTGCTTATG ATCAAGCTGT AGCTGCTGCG AAAGCAAITT TAAATAAACA 19680

AACAGGTTCA AATTCAGATA AAGCAGCAGT TGACC6TGCA TTACAACAAG TAACAAGTAC 19740 

GAAAGATGCA TTGAATGGTG ATGCAAAACT GGCAGAAGCG AAAGCGGCAG CTAAACAAAA 19800 

CTTAGGCACT TTAAACCATA TTACGAATGC ACAACGTACT GACTTAGAAG GCCAAATCAA 19860 

TCAAGCGACG ACTGTTGATG GCGTTAATAC TGTAAAAACA AATGCCAATA CATTAGACGG 19920

CGCAATGAAT AGCTTACAAG GTTCAATCAA TGATAAAGAT GCGACATTAA GAAATCAAAA 19980 

TTATCTTGAT GCGGATGAAT CAAAACGAAA TGCATATACG CAAGCTGTCA CA6CGGCTCA 20040 

AGGCATTTTA AATAAACAAA CTGGTGGTAA CACATCTAAA GCAGACGTTG ATAATGCATT 20100

AAATGCAGTT ACAAGAGCGA AAGcGgCTTT AAATGGTGCT GACAACTTAA GAAATGCGAA 20160 

AACTTCAGCA ACAAATACGA TTGATGGTTT ACCTAACTTA ACACAATTAC AAAAAGAC7VA 20220 

CTTGAAGCAT CAAGTTGAaC AAGCGCAAAA TGTAGCAGGT GTAAATGGTG TTAAAGATAA 20280

AGGTAATACG TTAAATACTG CCATGGGTGC ATTACGTACA AGTATCCAAA ATGATAATAC 20340 

GACGAAAACA AGTCAAAATT ATCTTGATGC ATCTGACAGC AACAAAAATA ATTACAATAC 20400 

TGCTGTAAAT AATGCAAATG GTGTTATTAA TGCAACGAAC AATCCAAATA TGGATGCTAA 20460 

TGCGATTAAT GGCATGGCAA ATCAAGTCAA TACAACAAAA GCAGCGTTAA ATGGTGCACA 20520 

AAACTTAGCT CAAGCTAAAA CAAATGCGAC GAACACAATT AACAACGCAC ATGACTTAAA 20580 

CCAAAAACAA AAAGATGCAT TAAAAACACA AGTTAACAAT GCACAACGTG TATcTGATGC 20640 

AAATAACGTT CAACACACTG CAACTGAATT GAACAGTGCG ATGACAGCAC TTAAAGCAGC 20700 

TATTGCTGAT AAAGAAAGAA CAAAAGCAAG CGGTAATTAT GTCAATGCTG ATCAAGAAAA 20760

ACGTCAAGCG TATGATTCAA AAGTGACTAA CGCTGAAAAT ATCATTAGTG GTACACCGAA 20820 

TGCGACATTA ACAGTCAATG ACGTAAATAG TGCGGCATCA CAAGTCAATG CGGCTAAAAC 20880 

AGCATTAAAT GGTGATAACA ACTTACGTGT AGCGAAAGAG CATGCCAACA ATACAATTGA 20940

CGGCTTAGCA CAATTGAATA ATGCACAAAA AGCAAAATTA AAAGAACAAG TTCAAAGTGC 21000 
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GAAAGGCTTA AGAGATAGTA TTGCGAATGA AGCAACAATT AAAGCAGGTC AAAACTACAC 21120

TGACGCAAGT CCAAATAATC GTAACGAGTA CGACAGTGCA 6TTACTGCAG CAAAAGCAAT 21180

CATTAATCAA ACATCGAACC CAACGATGGA ACCAAATACT ATTACGCAAG TAACATCACA 21240

AGTGACAACT AAAGAACAGG CATTAAATGG TGCX3CGAAAC TTAGCTCAAG CTAAGACAAC 21300

TGCGAAAAAC AACTTGAATA ACTTAACATC AATTAACAAT GCACAAAAAG ATGCGTTAAC 21360

GCGTAgcATT GATGGTGCAA CAACAGTAGC TGGTGTAAAT CAAGAAACTG CAAAAGCAAC 21420

AGAATTAAAT AACGCAATGC ATAGTTTACA AAATGGTATC AATGATGAGA CACAAACAAA 21480

ACAAACTCAG AAATACCTAG ATGCAGAGCC AAGTAAGAAA TCAGCTTATG ATCAAGCAGT 21540

AAATGCAGCG AAAGCAATTT TAACAAAAGC TAGTGGTCAA AATGTAGACA AAGCAGCAGT 21600

TQAACAAGCA TTGCAAAATG TGAACAGTAC GAAGACGGCG TTGAACGGTC ATGCGAAATT 21660

AAATGAAGCT AAAGCAGCTG CGAAACAAAC GTTAGGTACA TTAACACACA TTAATAATGC 21720

ACAACGTACA GCGTTAGACA ATGAAATTAC ACAAGCAACA AATGTTGAAG GTGTTAATAC 21780

AGTTAAAGCC AAAGCGCAAC AATTAGATGG TGCTATGGGT CAATTAGAAA CATCAATTCG 21840

TGATAAAGAC ACGACGTTAC AAAGTCAAAA TTATCAAGAT GCTGATGATG CTAAACGAAC 21900

TGCTTATTCT CAAGCAGTAA ATGCAGCAGC AACTATTTTA AATAAAACAg CTGGCGGTAA 21960

TACACCTAAA GCAGATGTTG AAAGAGCAAT GCAAGCTGTT ACACAAGCAA ATACTGcATT 22020

AAACGGTATT CAmAACTTAG ATCGTGCGAA ACArGCTGCT AACACAGCGA TTACAAATGC 22080

TTCGGACTTA AATACAAAAC mAAAAGAAGC ATTAAAAgCA CAAGTAACAA GTGCAGGACG 22140

TGTATCTGCA GCAAATGGTG TTGAACATAC TGCGACTGAA TTAAATACTG CGATGACAGC 22200

TTTAAAGCGT GCCATTGCTG ATAAAGCTGA GACAAAAGCT AGTGGTAACT ATGTCAATGC 22260

TGATffCGAAT AAACGTCAAG CATATGATGA AAAAGTTACA GCTGCCGAAA ATATCGTTAG 22320

TGGTACACCA ACACCAACGT TAACACCAGC AGATGTTACA AATGCAGCAA CGCAAGTAAC 22380

GAATGCTAAG ACGCAGTTAA ACGGTAATCA TAATTTAGAA GTAGCGAAAC AAAATGCTAA 22440

CACTGCAATT GATGGTTTAA CTTCTTTAAA TGGTCCGCAA AAAGCAAAAC TTAAAGAACA 22500

AGTGGGTCAA 6CGACGACGT TGCCAAATGT TCAAACTGTT CGTGATAATG CACAAACATT 22560

AAACACTGCA ATGAAAGGTC TACGAGATAG CATTGCGAAT GAAGCAACGA TTAAAGCAGG 22620

TCAAAACTAC ACAGATGCAA GTCAAAACAA ACAAACTGAC TACAACAGTG CAGTCACTGC 22680

AGCAAAAGCA ATCATTGGTC AAACAACTAG TCCATCAATG AATGCGCAAG AAATTAATCA 22740

AGCGAAAGAC CAAGTGACAG CTAAACAACA AGCGTTAAAC GGTCAAGAAA ACTTAAGAAC 22800
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AGATGCAGTG AAACGTCAAA TCGAAGGTGC AACGCATGTT AATGAAGTAA CACAAGCACA 22920 

AAATAATGCG GATGCaTTAA ATACAGCTAT GACGAACTTG AAAAATGGTA TTCAAGATCA 22980 

GAATACGATT AAGCAAGGTG TTAACTTCAC TGATGCCGAC GAAGCGAAAC GTAATGCATA 23040 

TACAAATGCA GTGACGCAAG CTGAACAAAT TTTAAATAAA GCACAAGGTC CAAATACTTC 23100 

AAAAGACGGT GTCGAAACTG CGTTAGAaAA TGTACAACGT GCTAAAAACG AATTGAACGG 23160 

TAATCAAAAT GTTGCGAACX5 CTAAGACAAC TGCGAAAAAT GCATTGAATA ACCTAACATC 23220 

AATTAATAAT GCACAAAAAG AAGCATTGAA ATCACAAATT GAAGGTGCGA CAACAGTTGC 23280 

AGGTGTAAAT CAAGTGTCTA CAACX^OCATC TGAATTAAAT ACAGCAATGA GCAACTTACA 23340 

AAATGGTATT AATGATGAAG CAGCTACAAA AGCAGCGCTT AATGGTACTC AAAACCTTGA 23400 

AAAAGCTAAA CAACACGCAA ATACAGCAAT TQACGGTTTA AGCCATTTAA CAAATCCACA 23460 

AAAAGAGGCA TTAAAACAAT TGGTACAACA ATCGACTACT GTTGCAGAAG CACAAGGTAA 23520 

T6AGCAAAAA GCAAACAATG TTGATGCAGC AATGGACAAA TTACGTCAAA GTATTGCAGA 23580 

TAATGCGACA ACAAAACAAA ACCAAAATTA TACTGATGCA A6TCAGAATA AAAAGGATGC 23640 

GTACAATAAT GCTGTCACAA CTGCACAAGG TATTATTGAT CAAACTACAA GTCCAACTTT 23700 

AGATCCGACT GTTATCAATC AAGCTGCTGG ACAAGTAAGC ACAACTAAAA ATGCATTAAA 23760 

TGGTAATGAA AACCTAGAGG CAGCGAAACA ACAAGCGTCA CAATCATTAG GTTCATTAGA 23820

TAACTTAAAT AATGCGCAAA AACAAACAGT TACTGATCAA ATTAATGGCG CGCATACTGT 23880 

TGATGAAGCA AATCAAATTA AGCAAAATGC GCAAAACTTA AATACAGCGA TGGGTAACTT 23940 

GAAACAAGCG ATAGcTGACA AAGATGCTAC GAAAGCQACA GTTAACTTCA CTGATGCAGA 24000 

TCAAGCAAAA CAACAAGCAT ATAACaCTGC TGTTACAAAT GCTGAAAATA TCATTTCAAA 24060 

AGCTAATGGC GGCAATGCAA CACAAGCTGA AGTTGAACAA GCAATCAAAC AAGTTAATGC 24120 

TGCAAAACAA GCATTAAATG GTAATGCCAA CGTTCAACAT GCAAAAGACG AAGCAACAGC 24180 

ATTAATTAAT AGCTCTAATG ACCTTAACCA AGCACAAAAA GACGCATTAA AACAACAAGT 24240 

TCAAAATGCA ACTACTGTAG CTGGTGTAAA CAATGTTAAA CAAACAGCAC AAGAGTTAAA 24300 

CAATGCTATG ACACAATTAA AACAAGGCAT TGCAGATAAA GAACAAACAA AAGCTGATGG 24360 

TAACTTTGTC AATGCAGATC CTGATAAGCA AAATGCATAT AATCAAGCAG TAGCQAAAGC 24420 

TGAAGCATTA ATTAGTGctA CGCCTGATGT TGTCGTTACA CCTAGCGAAA TTACTGCAGC 24480 

GTTAAATAAA GTTACGCAAG CTAAAAATGA TTTAAATGGT AATACAAACT TAGCAACGGC 24540 

GAAACAAAAT GTTCAACATG CTATTGATCA ATTGCCAAAC TTAAACCAAG CGCAACGTGA 24600 
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AGCGGCGACA ACGCTTAATG ACGCGATGAC ACAATTGAAA CAAGGTATTG CGAATAAAGC 24720 

ACAAATTAAA GGTAGCGAGA ACTATCACGA TGCTGATACT GACAAGCAAA CAGCATATQA 24780 

TAATGCAGTA ACAAAAGCAG AAGAATTGTT AAAACAAACA ACAAATCCAA CAATGGATCC 24840

AAATACAATT CAACAAGCAT TAACTAAAGT GAATGACACA AATCAAGCAC TTAACGGTAA 24900

TCAAAAATTA GCTGATGCCA AACAAGATGC TAAGACAACA CTTGGTACAC TAGATCATTT 24960 

AAATGATGCT CAAAAACAAG CGCTAACAAC TCAAGTTGAA CAAGCACCAG ATATTGCAAC 25020 

AGTTAATAAT GTTAAGCAAA ATGCTCAAAA TCTGAATAAT GCTATGACTA ACTTAAACAA 25080 

TGCATTACAA GATAAAACTG AGACATTAAA TAGCATTAAC TTTACTGATG CAGATCAAGC 25140 

TAAGAAAQAT GCTTATACTA ATGCGGTTTC ACATGCAGAA GGTATTTTAT CTAAAGCAAA 25200 

TGGCAGCAAT GCAAGTCAAA CTGAAGTGGA ACAAGCGATG CAACGTGTGA ACGAAGCGAA 25260 

ACAAGCATTG AATGGTAATG ACAATGTACA ACGTGCAAAA GATGCAGCGA AACAAGTGAT 25320 

TACAAATGCA AATGATTTAA ATCAAGCAAT GACACAATTG AAACAAGGTA TTGCAGATAA 25380 

AGACCAAACT AAAGCAAATG GTAACTTTGT CAATGCTGAT ACTGATAAGC AAAATGCTTA 25440 

CAACAATGCG GTAGCACATG CTGAACAAAT AATTAGTGGT ACACCAAATG CAAACGTGGA 25500

TCCACAACAA GTGGCTCAAG CGTTACAACA AGTGAATCaA GCTAAGGGTG ATTTAAACGG 25560 

TAACCATAAC TTACAAGTTG CTAAAGACAA TGCAAATACA GCCATTGATC AGTTACCAAA 25620 

CTTAAATCAA CCACAAAAAA CAGCATTAAA AGACCAAGTG TCX5CATGCAG AACTTGTTAC 25680

AGGTGTTAAT GCTATTAAGC AAAATGCTGA TGCGTTAAAT AATGcAATGG GTACATTGAA 25740

ACAACAAATT CAAGCGAACA GTCAAGTACC ACAGTCAGTT GACTTTACAC AAGCGGATCA 25800 

AGACAAACAA CAAGCATATA ACAATGCGGC TAACCAAGCG CAACAAATCG CAAATGGCAT 25860 

ACCAACACCT GTATTGACGC CTGATACAGT AACACAAGCA GTGACAACTA TGAATCAAGC 25920 

GAAAGATGCA TTAAACGGTG ATGAAAAATT AGCACAAGCG AAACAAGAAG CTTTAGCAAA 25980 

TCTTGATACG TTACGCGATT TAAATCAACC ACAACGTGAT GCATTACGTA ACCAAATCAA 26040 

TCAAGCACAA GCGTTAGCTA CAGTTGAACA AACTAAACAA AATGCACAAA ATGTGAATAC 26100 

aGCaATGAGT AACTTGAAAC aAGGTATTGC aAACAAAGAT ACTGTCAAAG CAAGTGAGAA 26160 

CTATCATGAT GCTGATGCCG ATAAGCAAAC AGCATATACA AATGCAGTGT CTCAAGCQGA 26220 

AGGTATTATC AATCAAACGA CAAATCCAAC GCTTAACCCA GATGAAATAA CACGTGCATT 26280 

AACTCAAGTG ACTGATGCTA AAAATGGCTT AAACGGTGAA GCTAAATTGG CAACTGAAAA 26340

GCAAAATGCT AAAGATGCCG TAAGTGGGAT GACX3CATTTA AACGATGCTC AAAAACAAGC 26400 
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AGCAACGAGC CTAGATCAAG CAATGGATCA ATTATCACAA GCTATTAATG ATAAAGCTCA 26520 

AACATTAGCG GACGGTAATT ACTTAAATGC AGATCCTGAC AAACAAAATG CGTATAAACA 26580 

GGCAGTAGCA AAAGCTGAAG CATTATTGAA TAAACAAAGT GGTACTAATG AAGTACAAGC 26640

ACAAGTTGAA AGCATCACTA ATGAAGTGAA CGCAGCGAAA CAAGCATTAA ATGGTAATGA 26700 

CAATTTGGCA AATGCAAAAC AACAAGCAAA ACAACAATTG GCGAACTTAA CACACTTAAA 26760 

TGATGCACAA AAACAATCAT TTGAAAGTCA AATTACACAA GCGCCACTTG TTACAGATGT 26820 

CACTACGATT AATCAAAAAG CACAAACGTT AGATCATGCG ATGGAATTAT TAAGAAATAG 26880 

TGTTGCGGAT AATCAAACGA CATTAGCGTC TGAAGATTAT CATGATGCAA CTGCGCAAM 26940 

ACAAAATGAC TATAACCAAG CTGTAACAGC TGCTAATAAT ATAATTAATC AAACTACATC 27000 

GCCTACGATG AATCCAGATG ATGTTAATGG TGCAAOGACA CAAGTGAATA ATAC6AAAGT 27060 

TGCATTAGAT GGTGATGAAA ACCTTGCAGC AGCTAAACAA CAAGCAAACA ACAGACTTGA 27120 

TCAATTAGAT CATTTGAATA ATGCX3CAAAA GCAACAGTTA CAATCACAAA TTACGCAATC 27180 

ATCTCATATT GCTGCAGTTA ATGGTCACAA ACAAACAGCA GAATCTTTAA ATACTGCGAT 27240 

GGGTAACTTA ATTAATGCGA TTGCAGATCA TCAAGCCGTT GAACAACGTG GTAACTTCAT 27300

CAATGCTGAT ACTGATAAAC AAACTGCTTA TAATACAGCG GTAAATGAAG CAGCAGCAAT 27360 

GATTAACAAA CAAACTGGTC AAAATGCGAA CCAAACAGAA GTAGAACAAG CTATTACTAA 27420 

AGTTCAAACA ACACTTCAAG CGTTAAATGG AGACCATAAT TTACAAGTTG CTAAAACAAA 27480

TGCGACGCAA GCAATTGATG CTTTAACAAG CTTAAATGAT CCTCAAAAAA CAGCATTAAA 27540 

AGACCAAGTT ACAGCTGCAA CTTTAGTAAC TGCAGTTCAT CAAATTGAAC AAAATGCGAA 27600 

TACGCTTAAC CAAGCAATGC ATGGTTTAAG ACAGAGCATT CAAGATAACG CAGCAACTAA 27660 

AGCflAATAGC AAATATATCA ACGAAGATCA ACCAGAGCAA CAAAACTATG ATCAAGCTGT 27720 

TCAAGCCGCA AATAATATTA TCAATGAACA AACTGCAACA TTAGATAATA ATGCGATTAA 27780 

TCAAGCAGCG ACAACTGTGA ATACAACGAA AGCAGCATTA CATGGTGATG TGAAGTTACA 27840 

AAATGATAAA GATCATGCTA AGCAAACGGT TAGTCAATTA GCACATCTAA ACAATGCACA 27900 

AAAACATATG GAAGATACX3T TAATTGATAG TGAAACAACT AGAACAGCAG TTAAGCAAGA 27960 

TTTGACTGAA GCACAAGCAT TAGATCAACT TATGGATGCA TTACAACAAA GTATTGCTGA 28020 

CAAAGATGCA ACACGTGCGA GCAGTGCATA TGTCAATGCA GAACCGAATA AAAAACAATC 28080

CTATGATGAA GCAGTTCAAA ATGCTGAGTC TATCATTGCA GGATTAAATA ATCCAACTAT 28140

CAATAAAGGT AATGTATCAA GTGCGACTCA AGCAGTAATA TCATCTAAAA ATGCATTAGA 28200 
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TCAATTAACA CCAGCTCAAC AACAAGCGCT AGAAAATCAA ATTAATAATG CAACAACTCG 28320

TGATAAAGTG GCTGAAATCA TTGCACAAGC GCAAgCATtA AATGAAGCGA TGAAAGCATT 28380 

AAAAGAAAGT ATTAAGGATC AACCACAAAC TGAAGCAAGT AGTAAATTTA TTAAGGAGGA 28440 

TCAAGCGCAA AAAQATGCTT ATACGCAAGC AGTACAACAC GCGAAAGATT TGATTAACAA 28500 

AACAACTGAT CCTACATTAG CTAAATCAAT CATTGATCAA GCGACACAGG CAGTGACAGA 28560 

TGCTAAAAAC AATTTACATG GTGATCAAAA ACTAGCTCAA GATAAGCAAC GTGCAACAGA 28620 

AACGTTAAAT AACTTGTCTA ACTTGAATAC ACCACAACGT CAAGCACTTG AAAATCAAAT 28680 

TAATAATGCA GCAACTCGTG GCGAAGTAGC ACAAAAATTA ACTGAAGCAC AAGCACTTAA 28740 

CCAAGCAATG GAAGCTTTAC GTAATAGCAT TCAAGATCAA CAGCAAACXK? AAGCGGGTAG 28800 

CAAGTTTATC AATGAAGATA AaCCaCmAAA AGrTGCTTAC CAAGCAGCAG TTCAAAATGC 28860 

AAAAGATTTA ATTAATCAAA CTAACAATCC AACGCTTGAT AAAGCACAAG TTGAACAATT 28920 

GACACAAGCT GTTAACCAAG CTAAAGATAA CCTACACGGT GATCAAAAAC TTGCAGACX3A 28980 

TAAACAACAT GCGGTTACTG ATTTAAATCA ATTAAATGGT TTGAATTkATC CX3CAACGTCA 29040 

AGCACTTGAA AGCCAAATAA ACAACGCAGC AACTCGTGGC GAAGTAGCAC AAAAATTAGC 29100 

TGAAGCAAAA GCGCTTGATC AAGCAATGCA AGCATTACGT AATAGTATTC AAGATCAACA 29160 

ACAAACAGAA TCTGGTAGCA AGTTTATCAA TGAAGATAAA CCGCAAAAAG ATGCTTACCA 29220 

AGCAGCAGTT CAAAATGCAA AAGATTTAAT TAACCAAACA GGTAATCCAA CACTCGACAA 29280 

ATCACAAGTA GAACAATTGA CACAAGCAGT AACAACTGCA AAAGATAATC TACATGGTGA 29340

TCAAAAACTT GCTCGTGATC AACAACAAGC AGTAACAACT GTAAATGCAT TGCCAAACTT 29400 

AAATCATGCA CAACAACAAG CATTAACTGA TGCTATAAAT GCAGCGCCTA CAAGAACAGA 29460 

GGTTSCACAA CATGTTCAAA CTGCTACTGA ACTTGATCAC GCGATGGAAA CATTGAAAAA 29520 

TAAAGTTGAT CAAGTGAATA CAGATAAGGC TCAACGAAA7 TACACTGAAG CGTCAACTGA 29580 

TAAAAAAGAA GCAGTAGATC AAGCGTTACA AGCTGCAGAA AGCATTACAG ATCCAACTAA 29640 

TGGTTCAAAT GCGAATAAAG ACGCTGTAGA CCAAGTATTA ACTAAGCTTC AAGAAAAAGA 29700 

AAATGAGTTA AATGGTAATG AGAGAGTCGC TGAAGCTAAA ACACAAGCGA AACAAACTAT 29760 

TGACCAATTA ACACATTTAA ATGCTGATCA AATTGCAACT GCTAAACAAA ACATTGATCA 29820 

AGCGACGAAA CTTCAACCAA TTGCTGAATT AGTA6ATCAA GCAACGCAAT TGAATCAATC 29880 

TATGGATCAA TTACAACAAG CAGTTAATGA ACATGCTAAC GTTGAGCAAA CTGTAGATTA 29940 

CACACAAGCA GATTCAGATA AACAAAATGC TTATAAACAA GCTATTGCTG ATGCTGAAAA 30000 
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TGCAAAACAA GCATTAAATG GTGATGAACG TGTAGCACTT GCTAAAACAA ATGGTAAACA 30120 

TGACATCGAC CAATTGAATG CATTAAACAA TGCTCAACAA GATGGATTTA AAGGTCGCAT 30180

CGATCAATCA AACGATTTAA ATCAAATCCA ACAAATTGTA GATGAGGCTA AGGCACTTAA 30240 

TOGTGCAATG GATCAATTGT CACAAGAAAT CACTGACAAT GAAGGACGCA CGAAAGGTAG 30300 

CACGAACTAT GTCAATGCAG ATACACAAGT CAAACAAGTA TATGATGAAA CGGTTGATAA 30360 

AGCGAAACAA GCACTTGATA AATCGACTGG TCAAAACTTA ACTGCAAAAC AAGTTATCAA 30420 

ATTAAATGAT GCAGTCACTG CAGCTAAGAA AGCATTAAAT GGTGAAGAAA GACTTAATAA 30480

TCGTAAAGCT GAAGCATTAC AAAGATTGGA TCAATTAACA CATCTAAACA ATGCTCAAAG 30540 

ACAATTAGCA ATCCAACAAA TTAATAAT6C TGAAAC6CTA AATAAA6CAT CTCGAGCAAT 30600 

TAATAGAGCA ACTAAATTAG ATAAT^GCAAT GGGTTCAGTA CAACAATATA TTGACGAACA 30660 

GCACCTTGGT GTTATCAGCA 6CACAAATTA CATCAATGCA GATGACAATT TGAAAGCAAA 30720 

TTATGATAAT GCAATTGCGA ATGCAGCACA TGAGTTAGAT AAAGTGCAAG GTAATGCAAT 30780

TGCaAAAGCT GAAGCAGAGC AATTGAAACA AAATATTATC GATGCTCAAA ATGCATTAAA 30840 

TGGAGACCAA AACCTTGCAA ATGCCAAAGA TAAAGCAAAT GCGTTTGTTA ATTCGTTAAA 30900 

TGGATTAAAT CAACAGCAAC AAGATCTTGC ACATAAAGCA ATTAACAATG CCGATACTGT 30960 

ATCAGATGTA ACAGATATTG TTAATAATCA AATTGACTTA AATGATGCAA TGGAAACATT 31020 

GAAACATTTA GTTGACAATG AAATTCCAAA TGCAGAGCAA ACTGTCAATT ACCAAAACGC 31080 

TGACGATAAT GCTAAA 31096 
(2) INFORMATION FOR SEQ ID NO: 60: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2243 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

ATGACAGAAT GGGAGCGAGG ACTTAGAATG TTTCCTAAAT CAGGTTTATT AAATTTTGAG 60 

TTAGCGATAG mAAATCGTTC ATTAAATGAT GATGAA^AAG CATTAAAATA TGTGCGTAAA 120 

GCATTAAATG CAGACCCTAA AAATACAGAT TATATTAACT TAGAAAAAGA GTTGACTAAA 180

TCAAATGAGT CGAAAAATAA ATAACTTTTA TGATGTACAA CAGTTATTGA AAAGTTACGG 240 

ATTTCTAATA TATTTTAAAA ATCCAGAAGA TATGTACQAA ATGATTCAAC AGGAGATTTC 300 
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TAATCAGAGA AGGAATGAAC AGAAAT6ACA AAAATTATTT TAGCAGCTGA TGTAGGCGGG 420 

ACGACTTGTA AATTAGGTAT TTTCACACCT GAATTAGAAC AATTACATAA ATGGTCTATT 480 

CACACTGATA CATCTGATAG TACAGGATAT ACACTTTTGA AAGGAATTTA TGATTCGTTT 540 

GTTGAAAAAG TAAATGAAAA TAATTATAAT TTTTCAAATG TACTTGGCGT AGGTATTGGT 600 

GTACCAGGTC CTGTTGACTT TGAAA/^GGT ACAGTAAATG GAGCAGTAAA CTTATATTGG 660

CCAGAAAAAG TTAATGTACG TGAGATTTTT GAACAATTCG TTGATTGTCC AGTGTATGTA 720 

GATAATGATG CTAACATAGC TGCTTTAGGG GaGAAACACA AAGGTGCTGG TGAAGGTGCC 780 

GATGATGTTG TTGCCATCAC ACTTGGTACA GGTCTAGGTG GAGGAATTAT TTCCAAATGG 840 

TGAAATCGTA CATGGTCATA ATGGCTCtGG CGCAGAAATA GGTCATTTTA GAgCAGACTT 900 

CSATCAACGA TTTaAATGTA ATTGTGGTC6 TTCTGGATGT ATTGAAACAG TTGCTTCaGC 960 

GACAGGC3TT GTTAACTTAG TTAACTTCtA CTATCCGAAG TTGACGTTTA GATCTTCTAT 1020

ATTAGAATTG ATTAAAGAAA ATAAGGTtAC aGCAAAAGCT GTTTTrGATG CGGCAAAAGC 1080

TGGTGACCAA TTCTGTATTT TCAITACTGA AAAGGTTGCA AACTATATTG GATATTTATG 1140 

TAGTATTATT AGTGTTACAA GTAATCCGAA ATATATCGTT CTAGGTGGAG GAATGTCTAC 1200

TGCAGGACCT ATTTTAATTG AAAATATTAA AACAGAATAT CATAATTTAA CATTTGCACC 1260 

TGCTCAATTT GAAACTGAAA TTGTACAAGC GAAATTAGGT AATGATGCAG GTATTACAGG 1320

AGCAGCAGGA TTAATCAAGA CCTATGTATT AGATAAAGAG GGGGTAAAAT AATGGCTATT 1380 

GTTGATGTGG TTGTTATTCC AGTTGGAACG GAAGGTCCGA GTGTTAGTAA ATATATTGCA 1440 

6ATATTCAGA AAAAACTTCA AGAATATAAA GCAATGGGTA AAATTGATTT TCAATTAACA 1500 

CCAATGAATA CTCTAATTGA AGGTGAATTA AGCGATGTAT TAGAAGTTGT OCAAQTGATA 1560 

CATGSATTAC CTTTTGATAA AGGTTTAAGT AGAGTTTGTA CAAATATCCG TATTGATGAC 1620 

CXSACGAGACA AATCTAGAAA AATGAATGAT AAACTAACAT CA6TACAAAA ACATTTAGAA 1680 

AATAGTGGTG AAAACCTATG AGGATTTCAA GCTTAACTTT AGGCTTAGTT GATACTAATA 1740 

C3GTATTTCAT CGAAAATGAC AAAGCTGTTA TTCTGATTGA CCCTTCAGGT GAAAGTGAAA 1800 

AAATTATTAA AAAATTAAAC CAAATAAATA AACCGTTAAA AGCTATTTTA TTAACACATG 1860

CACACTTTGA TCATATCGGA GCAGTCGATG ATATAGTTGA TCGATTCGAT GTCCCGGTTT 1920 

ATATGCATGA AGCAGAGTTT GATTTTCTAA AAGATCCCGT TAAAAATGGG GCAGATAAAT 1980 

TTAAGCAATA TGGATTACCA ATTATTACAA GTAAGGTAAC TCCT6AAAAG TTAAmCGAAG 2040 

GTAGCACAGA AATAGAAGGA TTTAAGTTnT nAyrTGTaCA CACACCTGGA CATTCACCAG 2100 
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GAATCGGACG TACAGATTTA TATAAAGGTG ATTATGAAAC GCTAGTTGAT TCTATTCAAG 2220 
ATAAAATATT TGAATTAGAA GGC 2243 
(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8009 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

TTGGnATCAT tyAcgGTAAA AAGAATAAaG CAAGATTtAT TTCATTAGTA CTAATTTGTC 60 

CAATGTTTGC AATTTGTTGG GTTGCATATA TTCAATGGGA 6TCTACAATC GCTTCATTTA 120 

CACAATCTAT TAATATTTCa ATGGCACAAT ATAGTGTTTT AT6GACAATT AACGGAATAA 180 

TGATTTTAGT AGCACAACCA TTAATTAAAC CGATTCTCTA TCTGTTAAAA GGAAACTTAA 240 

AGAAGCAAAT GTTTGTCGGC ATCATCATTT TTATGTTGTC GTTCTTTGTC ACGAGTTTTG 300 

CCGAAAACTT TACAATATTT GTTGTCGGTA TGATTATTTT AACTTTTGGA GAAATGTTTG 360

TATGGCCAGC AGTTCCAACT ATAGCCAATC AGTTAGCGCC AGATGGTAAG CAAGGACAGT 420 

ACCAAGGTTT TGTGAATTCA GCTGCTACAG TAGGAAAAGC ATTTGGTCCA TTTCTTGGTG 480

GTGTATTAGT TGATGCGTTT AATATGCGCA TGATGTTTAT CGGTATGATG CTACTACTTG 540 

TATTTGCATT AATATTATTA ATGGTTTTCA AGGAGAATAA TACGCAACCT AAAAAAATAG 600

ATGCATAATG AGTAAATAGA ATTAACGTTA TAGACTTGAA ATAAATGTCG TTATAACATA 660 

ATATTAATTT GTATAATTTA ATTTCGTTTG GAGCTTTTCT ACAGAAAGCT AGTGATGCTG 720 

AGAGCTAGTG TTAAGGACTA AATGTAAATC GTATTAATTT TAAATTGAAT GAATGACATC 780 

TCTTACTATT AAAATGAGTG CACAATTTTT GTGAAATAGG GTGGTAACGC GGCAAATGTC 840 

6TCCCTATGT AAATAGAATA GTTAGAGGTG TCTTTTTTAT TGAATAGGAG GAAATGTGTT 900 

GAATTACAAC CACAATCAAA TTGAAAAGAA ATGGcAAGAC TATTGGGACG AAAATAAAAC 960 

ATTTAAAACA AATGATAACT TAGGTCAAAA GAAATTTTAT GCTTTAGACA TGTTTCCATA 1020 

TCCATCAGGT GCTGGTTTAC ATGTTGGACA TCCTGAGGGc TATACAGCAA CAGATATCAT 1080 

TTCAAGATAT AAAAGAATGC AAGGATATAA TGTATTACAT CCGATGGGGT GGGATGCATT 1140 

CGGATTACCA GCAGAGCAAT ATGCTTTAGA CACTGGCAAC GACCCACGTG AATTTACAAA 1200

GAAAAATATC CAAACTTTTA AACGACAAAT TAAAGAATTA GGGTTCA6TT ATGATTGGGA 1260 
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GTTATATAAC 


AAAGGTTTAG CATACGTTGA TGAAGTTGCA GTTAACTGGT GTCCAGCATT 1380

AGGCACTGTT TTATCTAACG AAGAAGTGAT TGATGGTGTC TCTGAACGTG GTGGACATCC 1440

AGTTTATCGT AAGCCGATGA AACAATGGGT ACTTAAAATC ACAGAATATG CAGATCAATT 1500

ATTAGCAGAT TTAGATGATT TAGATTGGCC TGAGTCTTTA AAAGATATGC AGCGCAATTG 1560

GATTGGACGT TCTGAAGGGG CCAAAGTTTC ATTTGATGTA GATAATACGG AAGGAAAAGT 1620

AGAAGTATTT ACGACTAGAC CAGATACAAT CTATGGTGCA TCATTCTTAG TCTTAAGTCC 1680

TGAACATGCA TTAGTTAATT CAATTACAAC AGATGAATAT AAAGAAAAAG 7AAAAGCTTA 1740

TCAAACAGAA GCTTCTAAAA AGTCAGATTT AGAACOTACA GATTTAGCAA AAGATAAATC 1800

AGGTGTATTT ACTGGTGCAT ATGCAACTAA TCCTTTATCT GGTGAAAAAG TACAAATTTG 1860

GATTGCTGAT TATGTATTAT CAACATATGG TACTGGAGCA ATTATGGCAG TACCAGCGCA 1920

TGATGACAGA GATTATGAAT TTGCTAAAAA GTTTGATTTG CCAATCATTG AAGTCATCGA 1980

AGGTGGAAAT GTTGAAGAAG CAGCATACAC TGGTGAAGGT AAACATATTA ATTCTGGTGA 2040

ACTTGATGGT TTAGAAAATG AAGCGGCAAT TACTAAAGCT ATTCAATTAT TAGAGCAAAA 2100

AGGTGCTGGC GAAAAGAAAG TTAATTACAA ATTAAGAGAT TGGTTATTCA GTCGTCAGCG 2160

TTATTGGGGC GAACCAATTC CTGTCATTCA TTGGGAAGAT GGAACAATGA CAACTGTTCC 2220

TGAAGAAGAG CTACCATTGT TGTTACCTGA AACAGATGAA ATCAAGCCAT CAGGGACTGG 2280

TGAGTCTCCA CTAGCTAATA TTGATTCATT TGTAAATGTT GTAGATGAAA AAACAGGTAT 2340

GAAAGGAC6T CGTGAAACAA ATACAATGCC ACAATGGGCA GGTAGTTGTT GGTATTATTT 2400

ACGTTACATC GATCCTAAAA ATGAAAATAT GTTAGCAGAT CCTGAAAAAT TAAAACATTG 2460

GTTACCTGTT GATTTATATA TCGGTGGAGT AGAACATGCG GTTCTTCACT TATTATATGC 2520

AAGATTTTGG CATAAAGTCC TTTATGATTT GGCTATCGTA CCTACTAAAG AACCTTTCCA 2580

AAAATTATTT AACCAAGGTA TGATTTTAGG AGAAGGTAAT GAGAAGATGA GTAAATCTAA 2640

AGGAAATGTA ATCAATCCTG ATGATATAGT ACAGTCTCAT GGTGCAGATA CTTTGCGTCT 2700

TTACGAAATG TTTATGGGAC CTTTAGATGC TGCAATTGCA TGGAGTGAAA AAGGATTAGA 2760

TGGGTCTCGT CGATTCTTAG ATCGCGTATG GCGTTTAATG GTAAATGAAG ATGGGACATT 2820

GAGTTCAAAA ATTGTAACTA CAAATAATAA ATCTTTAGAT AAAGTTTATA ACCAAACTGT 2880

TAAAAAGGTA ACAGAAGACT TTGAAACATT AGGATTTAAT ACTGCTATTA GTCAATTAAT 2940

GGTATTTATT AATGAGTGTT ATAAAGTTGA TGAAGTTTAT AAACCTTACA TTGAAGGCTT 3000

CGTTAAAATG TTAGCACCTA TTGCACCACA TATCGGTGAA 6AATTATGGT CAAAATTAGG 3060
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TGATGAAGTA GAAATCGTTG TTCAAGTGAA TGGTAAATTG AGAGCTAAAA TTAAAATTGC 3180

TAAAGATACA TCAAAAGAAG AAATGCAAGA AATTGCCTTA TCTAATCACA ATGTTAAAGC 3240

GAGTATTGAA GGTAAAGACA TCATGAAAGT CATCGCTGTT CCTCAAAAAT TAGTCAATAT 3300

TGTAGCTAAA TAATGTTTTA AGGAGGACTT TGAAATGAAG TCAATTACTA CAGATGAATT 3360

AAAAAATAAA CTTTTAGAAT CTAAACCAGT TCAAATTGTT GATGTTCGTA CTGATGAAGA 3420

AACAGCAATG GGATATATTC CTAATGCAAA GTTAATTCCA ATGGATACCA TTCCGGATAA 3480

TTTAAATTCA TTTAATAAAA ATGAAATATA TTATATTGTA TGTGCTGGTG GAGTTCGAAG 3540

CGCTAAAGTT GTAGAATATT TAGAGGCAAA TGGCATTGAT GCCGTAAATG TCGAAGGCGG 3600

CATGCACGCA TGGGGCGATG AAGGTTTGGA AATAAAAAGT ATTTAAAGTA GTGACATAAT 3660

TTAAAATAAT ATTACATTTG TAATGACACC AAGTAACGTT TOGGTTGCTT GGTGTTTTTT 3720

GGTATGAATT ACTTTCTGTT ACAAAACAAT CTAAAGCGTT CTTGTTATGT TTTATTAAGA 3780

TTTTAATTAC AAAACGGAAA CTAAATTGTA ATAAAATAAA ACTTTATTTT ATAAAATGAT 3840

GATGATAAAA TTGAGTGAAC TTAAAATATT GTACAAAATA ATATAGCTAT AAATATAATA 3900

TAGCTATAAA TATAATATGA GGGAGCGTAT ATTTTTAGCA TAATTCTTAA CAACACAGCA 3960

GAGAACAGAC AACCAGGAGG AAAATGAAAT QAATTTGTTA AAGAAAAATA AATATAGTAT 4020

TAGGAAGTAT AAAGTAGGCA TATTCTCTAC TTTAATCGGA ACAGTTTTAT TACTTTCAAA 4080

CCCAAATGGT GCACAAGCCT TAACTACGGA TAATAATGTA CAAAGCGATA CTAATCAAGC 4140

AACACCTGTA AATTCACAAG ATAAAGATGT TGCTAATAAT AGAGGTTTAG CAAATAGTGC 4200

GCAGAATACA CCTAATCAAT CTGCAACAAC CAATCAAGCA ACGAATCAAG CATTGGTTAA 4260

TCATAATAAT GGTAGTATAG TAAATCAAGC TACGCCAACA TCAGTGCAAT CAAGTACGCC 4320

TTCAGCACAA AACAATAATC ATACAGATGG CAATACAACA GCAACTGAGA CAGTGTCAAA 4380

CGCTAATAAT AATGATGTAG TGTCX3AATAA TACCX3CATTA AATGTACCAA CTAAAACAAA 4440

TGAAAATGGT TCAGGAGGAC ATCTAACTTT AAAGGAAATT CAAGAAGATG TTCGTCATTC 4500

TTCAAATAAA CCAGAGCTAG TTGCAATTGC TGAACCAGCA TCTAATAGAC CGAAAAAGAG 4560

AAGTAGACGT GCGGCACCGG CAGATCCTAA TGCAACTCCA GCAGATCCAG CGGCTGCAGC 4620




GGTAGGAAAC GGTGGTGCAC CAGTTGCAAT TACAGCGCCA TATACGCCAA 


CAACTGATCC 


4680 




TAATGCCAAT AATGCAGGAC AAAATGCACC TAACGAAGTG CTGTCATTTG ATGACAATGG 


4740 


SO 


TATTAGACCA AGTACCAACC GTTCTGTGCC AACAGTAAAC GTTGTTAATA ACTTGCCGGG 


4800 




CTTCACACTA ATCAATGGTG GCAAAGTAGG GGTGTTTAGT CATGCAATGG 


TAAGAACGAG 


4860 
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TCGTATACAT GGAACTGATA CGAATGACCA TGGCGATTTT AATGGTATCG AGAAAGCATT 4980

AACAGTAAAT CCGAATTCTG AATTAATCTT TGAATTTAAT ACAAT6ACTA CTAAAAACGG 5040 

TCAAGGCGCA ACAAATGTTA TTATCAAAAA TGCTGATACT AATGATACGA TTGCTGAAAA 5100 

GACTGTTGAA GGCGGTCCAA CTTTGCGTTT ATTTAAAGTA CCTGATAATG TGAGAAATCT 5160 

CAAAATTCAA TTTGTACCTA AAAATGACGC AATAACAGAT GCGCGTGGCA TTTATCAACT 5220

AAAAGATGGT TACAAATACT ATAGCTTTGT TGACTCTATC GGACTTCATT CTGGGTCACA 5280

TGTTTTTGTT GATVAGACGAA CAATGGATCC AACAGCAACA AATAATAAAG AGTTTACTGT 5340

AACAACATCA TTAAAQAATA ATGGTAATTC TGGTGCTTCT CTAGATACAA ATGACTTTGT 5400 

ATATCAAGTT CAATTACCTG AAGGTGTTGA ATATGTGAAC AATTCATTGA CTAAAGATTT 5460 

TCCAAGTAAC AATTCAGGCG TTGATGTTAA TGATATGAAT GTTACATATG ATGCAGCAAA 5520 

TCGTGTGATA ACAATTAAAA GTACTGGAGG AGGTACAGCA AACTCTCCGG CACGACTTAT 5580 

GCCTGATAAA ATACTCGATT TAAGATATAA ATTACGTGTA AATAATGTGC CGACACCAAG 5640 

AACAGTAACA TTTAACGAGA CATTAACGTA TAAAACATAT ACACAAGATT TCATTAATTC 5700 

AGCTGCAGAA AGTCATACTG TAAGTACAAA TCCATATACT ATCGATATCA TCATGAATAA 5760

AGATGCATTA CAAGCCX3AAG TTGACAGACG TATTCAACAA GCTGATTATA CATTTGCX3TC 5820 

ATTAGATATC TTTAATGGTC TGAAACGACG CGCACAAACG ATTTTAGATG AAAATCGTAA 5880 

CAATGTACCA TTAAATAAAA GAGTTTCTCA AGCATATATT GATTCATTAA CTAATCAAAT 5940

GCAACATACG TTAATTCGAA GTGTTGATGC TGAAAATGCA GTTAATAAAA AAGTTGACCA 6000 

AATGGAAGAT TTAGTTAATC AAAATGATGA ATTGACAGAT GAAGAAAAAC AAGCAGCAAT 6060 

ACAAGTTATC GAGGAACATA AAAATGAAAT AATTGGTAAT ATTGGTGACC AAACGACTGA 6120 

TGATSGCGTT ACTAGAATCA AAGATCAAGG TATACAGACC TTAAGTGGGG ATACTGCAAC 6180 

ACCGGTTGTT AAACCAAATG CTAAAAAAGC AATACGTGAT AAAGCAACGA AACAAAGGGA 6240 

AATTATCAAT GCAACACCAG ATGCTACTGA AGACGAGATT CAAGATGCAC TAAATCAATT 6300 

AGCTAOGGAT GAAACAGATG CTATTGATAA TGTTAOGAAT GCTACTACAA ATGCTGACGT 6360 

TGAAACAGCT AAAAATAATG GCATCAATAC TATTGGAGCA GTTGTTCCTC AAGTAACTCA 6420 

TAAAAAAGCT GCAAGAGATG CAATTAACCA AGCAACAGCA ACGAAAAGAC AACAAATAAA 6480 

TAGTAATAGA GAAGCAACTC AGGAAGAGAA AAATGCAGCA TTGAACGAAT TAACTCAAGC 6540 

AACCAACCAT GCTTTAGAAC AAATCAATCA AGCAACAACA AATGCTAATG TTGATAACGC 6600

CAAAGGAGAT GGTCTAAATG CCATTAATCC AATTGCTCCT GTAACTGTTG TTAAGCAAGC 6660 
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TGATGCGACT CAAGAAGAAA GACAAGCAGC AATTGACAAA GTGAATGCTG CTGTAACTGC 6780

AGCAAACACA AACATTTTAA ACGCTAATAC CAATGCTGAT GTTGAACAAG TAAAGACAAA 6840 

TGCGATTCAA GGAATACAAG CAATTACACC AGCTACAAAA GTAAAAACAG ATGCAAAAAA 6900

TX3CCATCGAT AAAAGTGCGG AAACGCAACA TAATACGATA TTTAATAATA ATGATGCGAC 6960 

GCTCGAAGAA CAACAAGCAG CACAACAATT ACTTGATCAA GCTGTAGCCA CAGCGAAGCA 7020 

AAATATTAAT GCAGCAGATA CGAATCAAGA AGTTGCACAA GCAAAAGATC AGGGCACACA 7080 

AAATATAGTA GTGATTCAAC CGGCAACACA AGTTAAAACG GATACTCGCA ATGTTGTAAA 7140 

TGATAAAGCG CGAGAGGCGA TAACAAATAT CAATGCTACA ACT6GC6CGA CTCGAGAAGA 7200 

GAAACAAGAA GCGATAAATC GTGTC7UVTAC ACTTAAAAAT AGAGCATTAA CTGATATTGG 7260 

TGTGACGTCT ACTACTGCGA TGGTCAATAG TATTAGAGAC GATGCAGTCA ATCAAATCGG 7320 

CGCAGTTCAA CCGCATGTAA CGAAGAAACA AACTGCTACA GGTGTATTAA ATGATTTAGC 7380 

AACTGCTAAA AAGCAAGAAA TTAATCAAAA CACAAATGCA ACAACTGAAG AAAAGCAAGT 7440 

GGCTTTAAAT CAAGTGGATC AAGAGTTA6C AACGGCAATT AATmATATAA ATCAAGCTGA 7500 

TACAAATGCG GAAGTAGATC AAGCGCAACA ATTAGGTACA AAAGCAATTA ATGCGATTCA 7560

GCCAAATATT GTTAAAAAAC CTGCAGCATT AGCACAAATC AATCAGCATT ATAATGCTAA 7620 

ATTAGCTGAA ATCAATGCTA CACCAGATGC AACGAATGAT GAGAAAAATG CTGCGATCAA 7680 

TACTTTAAAT CAAGACAGAC AACAAGCTAT TGAAAGTATT AAACAAGCTA ACACAAATGC 7740

AGAAGTAGAC CAAGCTGCGA CAGTAGCAGA GAATAATATC GATGCTGTTC AAGTTGATGT 7800 

AGTAAAAAAA CAAGCAGCGC GAGATAAAAT CACTGCTGAA GTGGcGAacG TATTGaAGCG 7860 

GTTAAACAAA CACCTAATGC AACTGACGAA GAAAAGCAGG CTGCTGTTAA TCAAATCCAA 7920 

TCAAeTTTAA AGATTCAAGC AATTTAATCC AAATTTAATC CAAAACCCAA ACAAATGGAT 7980 

TCAGGGTAGG ACACCACTTA CAAATCCAA 8009 

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10953 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

ACCCACCCCn TGGGGATAnT TTACCTGGTG GGGCCTTCGA TTGCCTTTAG GTGAAACCaG 60 
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AGATGAATGC TAACCATATT CATTCTGCTA AAGATGGTCG TGTTACTGCG ACAGCTGAAA 180 

TTATTCATCG AGGTAAGTCG ACACATGTAT GGGATATAAA AATTAAGAAT GACAAAGAAC 240 

AATTAAITAC AGTTATGCGT GGTACAGTTG CTATTAAACC TTTAAAATAA AAGAACTGCT 300 

AGCTGAAATG TTATGA6ATA TTCATAACTA CGGCTAGCAG TTTTTTTATG CGCTATATTG 360 

TTGTAGTTTT AGAAATGCTT GTTCAATGCG TTCGGCAGCT TTACGGCCAC CCATAACATT 420 

TCTACCAAAT GGTCCTAATT CTAAGTCTGC AAAGCATCCT GCGACAAATA GATTTGGTAT 480

CCATTCTAAT TTTTCGGAAA TAACAGGGTA AITACATTCG TTGATAGGTG CATCATAATT 540 

TTGTATTAAT TGCTTAATAA GTGGTTGTGA CATAAAATCT TGTTCAAAAC CAGTTGCAAC 600 

CATAATCTGT TGATATGGAA CAGAATCATT TTCAGTGTTA ATTACACCAC CACTAATTTG 660 

AGTGATAGGT GTTTTATGCa CATTTATACG ACCATTTTTA ATATGTTTTT TAAGGCGTAA 720 

GTACAGTTCG TGAGGCATTG ATCCTTTATG ACGTTCGCGT TGTACAATGG CATTTCTTTC 780 

AGGCATGCTT TTAGTACTTA AAAATGAAGA CATATTTTTC GGACCTAACC AACCAGGATC 840 

AGCATCAAAG TCATGTATTT CAATATCTTT ATTTAGCCAT AAATGAATCT TTTTATCGTT 900 

ATCATGATTT AACAATTTAA GT6CAAGATG TGCAGCAGTa ATGCCX5CTAC CAACGATATG 960

ATCGGTCTTA TCATATACTA CTTGATCAAG TTCTTTCTCG AAGATATGAT TTACATTCTG 1020 

TTTGTCTTTT AAAATGTCAG GCATAAACGG AATATTTGTA CTGCCTATTG CAATAACGAC 1080 

GCAATCTGTA GTGATAATTT GTCCATCTTC TAACTTGATA TGCCATTTGT CTTCTTGTTT 1140

ATCTAAAGTT TGAACTAAAC CTTGAACCAA GCAATCCTCT AATTGATATT GTTTAGAAGC 1200 

ATGTGCAATA TGATCCATAA ACATTGTCAA TTCAGGTCGT TGATAAGGAC CATAAAAAGC 1260 

ATTTGTATAT TGGTGCTGTT TAGCGAATTG TTTTAGATGG AACGGTTGTG GATGTACGTG 1320 

ATGTACAATC GGTGATCTTA AATAAGGCAT TTCTATTCGA TTTGTATATG AGTTAAACCT 1380 

TTGGCAAAAA GTTTCGTGTG GGTCAATGAT TGITAATCGG TCTGTTGTTA ATCCGCTTGA 1440 

TAATAGTTTT TGTGCGATTG CAGTTCCCTG TATGCCACCG CCGATAATTG TCCAATCCAT 1500 

AATAAAACCT CTCTCTTTTT AAAACGTAAT AGTTACGATT TATAATTATT ATTATCATAA 1560 

TACATAACGA CATGAAAGGC AATTAAATTA AAGAGATATA TGTAGATAGG GCGAATCTGT 1620 

AGTCAAAGAA AAAATCATTG AAAAAGAGGT AACAATGTCA AAAGAwAACA GCAGTAAAAT 1680 

CATTCCTAAT TTGGAATCAT CTTACTGCTG TTTGTTGTTG ATTTATATTC ATGATTTTGT 1740

TATATAATCT ACAATTTTGT GTCTTTTAAG TCTTCCGAAA TTTCATCGAC TTTAGTCTTT 1800

TTAGTATAAG GCGTTTTAAT ATTATATGCT GCTTTCATAA TCATATGACT TGAAAQAGGA 1860 
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GCAATAAAAT ATAAAAACGT ACCAAATAGT AATGACATTG CACCTAATGT TGATGCTTTT 1980 

CCGGCAGCAT GTGCACGTGA ATATACATCT TCAAGTCTCA ATAATCCTAT AGCTGCTAGG 2040 

GCGCTAATTA AAGCACCGAT GATAACAAAG ATAAGTGCAA GACTAATCAG TATGATTTTG 2100

ATCATGTTCA ATCACCTTAC CTTTGTCCAT AAATTTAGAG AATACTGCAG TACCTAAAAA 2160 

AGCTAATATA CCAATCATCA TAATAACGAC AATCATGTAT TTAATATTTA ATAAAATACT 2220 

GAATAATGCT ATAACTGCCA TTAATTGAAG ACCAATCGCA TCTAATGCGA CAACACGATC 2280 

GGCAAGTGAT GGGCCTAGCA CAACGCGAAT GAGCATAGCT AACATAGAAA TGACAACTAT 2340 

GATTAATGCA ATAACGATAA TAACATTATG ATTCATTATA TTTCGCCCAC CTCTCTTACA 2400 

ATTTTCTCTA ATGATGTTTT AATACTTTCT ACTTCTTGCT CTTTAGTTGA AAAATCTATG 2460 

GCATGAATAT AAATTTTTGT ACGATCGTCA CTTACACCAA GCACTACAGT ACCAGGTGTT 2520 

AATGTAATTA AATTAGACAG CAAGACAATT TGCCAATCTT TTTTTAAATC TGTGTGATAA 2580 

ACAAAGAATC CTGGTTCATT TTTAATCGAA GGTTTAATAA TAATTTTCAA AACATCAAAA 2640 

TTAGCTTTAA TCAGTTCGAT TAAGAAAATA ATAACTAATT TAATAATACG ATATAGCGTG 2700 

ATGACATAAA ATCTACCTGG TAACACTCTG TGTAAGAGGT AAACAAGAAC TAGGCCAAAG 2760

ATGAAACCTA ACACAAAGTT ATTTGTTGTG TAACTATTTG TCACAAACAA CCAAAACACT 2820 

GCGATAATAA AGTTTAATAC TAATTGTACA GCCATGTTAT TTACCTCCTA ATACAGCTTT 2880 

AACGTAGGTT GATGGATTGT AGAATGTTTC TGCACCAGCT TTTACCATTG GATATAAGTA 2940 

ATCTGCTGAC AATCCATATA AAACAGTTAT CACAACTGCA ACGATTGCAA TCGTAGTTAA 3000 

ATATTTGACG TCGACTTTGT TATTAAGATC ATATCCTTTT GGTTGACCGA AAAAGCCTTC 3060 

TAGGAATATG CGAATGACAG AATATAATAC GACTAAACTT GATAATAAGA CGATGACACC 3120 

ACTTAAATAA AATCCTCTTT CAAATGTTGA TTGGACAATA AAAAATTTTC CATAAAAGCC 3180 

ACTGAGTGGG GGAATGCCAG CTAAACTTAA TGCTGCGATA AAGAATGACC AACCAAGTAC 3240 

AGGATATCGT TTAATTAAGC CACCAAATTG TCTTAAATCA 6CAGTGCCTG TAATTTTAAT 3300 

CATAATTCCG ATAAGCAAGA ATAATGCAAG TTTTACTAAC ATGTCGTGCA ATGTATAGTA 3360 

AATAGCCCCA ATCATACCTG ACTCTGTCAT CATTGCAACG CCX5ACTAAGA TCACACCTAC 3420

AGCAATCATG ACATTGTATA GGATGATTTT TTTAATGTTG GCATATGCAA CAGCACCGAC 3480

ACAACCAAAG ATGATCGTTA ATAGTGCTAA GAATAAAATG ACATAATGTG AAAAGCTTAC 3540 

ATTATCACTA AAGAATAGGC TCAATGTTCT AGCGATTGCA TAAACACCAA CTTTTGTTAA 3600

CAAAGCACCA AAGAATGCAA TGATTGGAAT TGGTGGgCAT AGTATGCACT AGGTAACCAA 3660 
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ATATTGACTA AGCCACTGTC ATGCGCTGAA AGGTTAGCTA ATTTATTGCT TATATCTGCT 3780 

AGATTCAATG TTCCTACTAC TGAATATAAA ATCGCTACAC CCATTACGAA GAAGGATGAC 3840

GATACAACGT TAACAAGAAC ATATTTTATT GTTTCTTGTA GTTGAATTTT TGTAGAACCA 3900

ATTACTAATA AGAAATAAGA TGACATTAAA AATACTTCGA AAAATACGAA TAGGTTGAAA 3960 

ATGTCACCAG TTGTGAATGC ACCAATGATA CCTATTAACA TAAATAGTAC TGAAAAATAA 4020

TAATAATATC TTTCACGTTC AATACCAATT GTTTGGTATG AATATAAAAT CACAATAGCT 4080 

GTAATAATAA TACTAGTAAT TATTAGTAGG GCACTGAATA TGTCTAATAC AAAGACAATA 4140 

CTGTATGGTG CTTTCCATGA ACCTAGCTCT ACGCGTATTG GTCCATGTTT AACAACATTT 4200 

GCTAAATTGA TAATTGCCGC GACCAAGGTT AATAATGTAC CGCCTAGTGC GACATAACGC 4260 

TTTATAATAG GACGCTTTCC AATAAAGACA AGTAATATGG CTGTAATTAC TGGAATAACT 4320 

AGCGTTAACA CAAGCATATT ACTTTCAATC ATCTTCTGGA ACTCCTTTCA TACTCTCAAC 4380 

GTTATCTGTG CCTAATTCTT TATATGTTCT AAATGCTAAT ACTAAGAAAA AGGCTGTTGT 4440 

CGCAAgGCGA TAACGATTGC TGTTAAAATA AGTGCTT6CG GGaTAGGaTC AACATAGCTT 4500 

TTTACGTTCG CTTCATAAAT TGGAACAGTA CCATGTTTAA GTCCGCCCAT AGTTATTAAA 4560

AATAAATTTG CTGCATGTGT TAATAGTGTA 6TTCCCATAA CAATTCGTAT CAGACTTTTA 4620 

GACAAAACGA GATAGACACT AATTGCTGTG AGAATACCAC TAACAAAAAT CATAATAATT 4680 

TCCACTATTC GTTCTCTCCA ATCGAAATAA TAATTGTCAT GACAGTACCA ACTACTGCAC 4740

ATAAAACACC GAAATCAAAG AATACTGCTG TTGTCATATG AACAGGTTCT AATATAAATA 4800 

ACGGTATATC AAATGTGACA TGCGTAAAGA AATTTTTGCC TAAAAACCAA CTTGCGATAG 4860 

GCGTCGCAAT ACAAAAAACT AATCCGATAC CTATCAAGAT TTTAAAATCT AATGGGAAAA 4920 

TTTlicGCAT TGTTTCTATA TCAAATGCAA TCGTAATGAT AACAAGTGAA CTTGCGAATA 4980 

ATAATCCGCC GACGAAACCG CCACCAGGTG TATAATGTCC TGCTAAGAAA AGTGAAAAAC 5040 

CAAAGACCAT TACXATGAAA AAGATAATAA CTGCAGCAAA TTGCAAAATT AGATCATTTT 5100 

GTTGTCTATT CATGATTTTT CACCTCGTTA CCTTGCGTTT GACGCTTTTT ACGTAATTTA 5160 

ATCATTGTAT ATACAGCTAA TCCTGCGATA CCAAGCACAG ATGACTCGAA TAAAGTATCC 5220

ATACCACGGA AATCAACAAG TATGACGTTT ACCATGTTTT TACCGTGAGC tAAATCATAA 5280 

ACGTGCTCTT GATAAAACTT AGATATCGAT TCAAAATGTC TATTTCCGTA TGCAATTAAA 5340 

CCGATAATAA TGACGGACAA ACCAACACCA CCAGCAATTA AAGCATTAGT AAGCTGGAAT 5400

GAGCGCTTTT CATTATAACG ATTTAAATTT GGTAAGTGGT AGAAGCATAA TAAGAACAAT 5460 
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CCAATTTAAG GTTTTCATTA CAGTATTACC TGACATCGTC GTTTTAATTA ATGTAAGCAT 7380 

ATAAATAAAT ATGACGATAG GGACAGGTAA TACGAACCAT CCTAAATGTA TACGTTTAAA 7440 

AAATCTATAC AGGATAGGAA TAATGAGTGC GAATATTAAC GGTAATATCA CCGCAATATG 7500

TAACAAACTC ACTATGTTGT CCTCCTTTAA AAAATATTTA TGTTATTCAT TATACATGAA 7560 

TGATATAGTT CTGAAAAACG TACACACTCC TTGTTGTGCT TTATTTTCAG AaGTATTTAA 7620 

ATAAGAAGAA ACACGTCATT TTTTATTTAA AATTTTCTTT GTATTGAAGT GAATAATCTT 7680 

CTTTTAAGCG TGCTAAACTA GCTAAAGACA TTTCAGCATG TTTTGTTTGC TGAGCTTTAA 7740 

GTTTAGTTTC TAAATCTGTA ATTGCTTGTT GAA6TGAATC TTCATAGCGC AATACATCAA 7800 

CATTGAAGTC GCGTAATTGT GAACGTTTCG TATAGOGTTT TTCAAAATGG CTTAATGCTT 7860 

TGCGGTCATG GAAAAATACA CCTTCAGTTT CAGTAGGGTT ATGTAAATCA CCTTGTTTCG 7920 

GGTGTTTGAT AACTTGTTCA ACTTTAACAA GGACATCGTC TCCATTTTCT TCAACAATCG 7980 

TGACACCATA GCTACCTGTT TTGTGTGAAA ATCGATATAG CTTCATGCTA TTTTCCTCCC 8040 

TTAAAAGTAT GTTAATATAT ATGTATCATA ACATGAATGG AGAATATAAA TGGCTAACTA 8100 

TCCACAGTTA AACAAAGAAG TACAACAAGG TGAAATCAAA GTGGTTATGC ACACAAATAA 8160

AGGTGACATG ACATTCAAAT TATTTCCAAA TATTGCACCA AAAACAGTTG AAAATTTTGT 8220 

GACACATGCA AAAAATGGTT ATTATGATGG AATCACATTC CACCGTGTCA TTAATGACTT 8280 

CATGATTCAA GGTGGCGATC CAACAGCTAC TGGTATGGGT GGCGAAAGTA TTTATGGCGG 8340

TGCTTTTGAA GATGAATTTT CATTAAATGC ATTTAACTTA TATGGCGCAT TATCAATGGC 8400

TAACTCAGGA CCTAATACTA ATGGTTCACA ATTTTTCATT GTTCAAATGA AAGAAGTACC 8460 

TCAAAATATG TTAAGTCAAC TTGCAGATGG TGGCTGGCCT CAACCAATCG TTGATGCATA 8520 

TGGCSAAAAG GGTGGTACAC CATGGTTAGA TCAAAAACAT ACAGTATTCG GTCAAATCAT 8580 

TGATGGTGAA aCTACATTAG AAGATATTGC AAATACAAAA GTGGGACCAC AAGATAAACC 8640 

ACTTCATGAT GTTGTAATTG AATCTATTGA TGTTGAAGAA TAATATCTAA ACATAATTAA 8700 

CTACCAACAT TTTAAACTCG GATAAAGCTA ATTTATGAAT GGATTAGTAT ATATTCCAAC 8760 

gAAAATAAAT AAACTAATAT GATGAGCAAT CTCAATATAT TTATCaAGAA AGCACAGTTT 8820

TTAAATAGAT GTGTATTTTA AAGATAATAG TTGAGGTTGC TTTTTATGTT TTTACAGAGA 8880 

ATTGCTATTC AAATAGTAAA TAAATTGAAA ACAAAGTAGC TGGATATCAT ATTGATTTAG 8940 

ATAGGAATTT GTTGCTAATT TTATTTGTAA ATCCAAGTTT GTAGAATTCT TATTCATTTA 9000

TAAAATAATA TTCGTATGAT TTGATTTTTT AATTAGTCCA CCATTTCGAT TTGTGCTATG 9060 
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AACATATCAA GGTGCGTGTA CTGGTATTCA ACCATACXXST GCX^TTTGTTG AGACCCCTAA 9180 

TCATACTGAA GGACTGATTC ATATATCAGA AATTATGGAT GACTACGTTC ATAATTTGAA 9240 

GAAATTTCTA TCAGAAGGCC AAATTGTTAA AGCTAAAATT TTGTCTATAG ATGATGAAGG 9300 

AAAGCTTAAT CTATCATTAA AGGATAATGA TTACTTCAAA AATTATGAGC GTAAGAAGGA 9360 

AAAACAATCA GTATTAGATG AAATCAGAGA AACAGAAAAA TATGGGTTTC AAACACTTAA 9420 

AGAACGCTTA CCAATCTGGA TAAAACAGTC AAAGCGAGCA ATTCGAAACG ACTAAAGGAA 9480 

CAGATAAATC GTACCGAAAA TCATACAAAG GGTCTGAAAT GAAAGTTTCT TAGACTATAA 9540 

AAGAGATTAG TATCTATTAA ATTTTATTAG ATACTAATCT CTTTTTGTCT ACGATAACGT 9600 

AATATGaTTG ATTCTATTTA CACGTACAAA TGGTTTAAGG TGACATATCC ATTATCTTTG 9660 

TTAGATAGAA TCGTTGATTT GCaATATTGT ATGTGGATTT GTTTTTTTTA TTTATTTTAG 9720

AAATGAGAAC TACAACTTAA AGTATTAAAC GAATTGCAAC TATATAAACA GATAATTGGA 9780 

GAATGAAAAA ATTACATGTT ATAGTCAACT CAATAATTTT AAGGAGGAAT TAAGTAATGA 9840 

AAAGTAAATA CGAACCATTG TTTGATAAAG TAGAATTACC AAATGGAGTA GAGTTGAGAA 9900 

ATCGATTTGT GTTAGCCCCT TTAACACATA TTTCTTCAAA TGATGATGGT ACTATTTCAG 9960 

ATGTAGAACT TCCTTATATT GAAAAGCGTT CACAAGATGT TGGTATTACA ATTAATGCTG 10020 

CGAGTAATGT GAGTGATGTC GGAAAAGCAT TTCCAGGACA GCCATCAATC GCGCATGACA 10080 

GTAATATTGA AGGACTAAAA CGATTAGCTA CAGCAATGAA GAAAAACGGT GCCAAAGCAC 10140 

TCGTACAAAT ACATCATGGC GGTGCACAAG CATTGCCTGA ATTAACACCT GATGGAGACG 10200 

TCGTAGCACC AAGTCCAATT TCTTTAAAAA GTTTTGGTCA GAAACAAGAA CATAGTGCTA 10260 

GAGAAATGAC GAATGAAGAG ATTGAACAAG CAATCAAGGA TTTTGGTGAA GCAACGCGAC 10320 

GTGCAATTGA AGCAGGGTTT 6ATGGTGTTG AAATACATGG CGCX3AATCAT TACTTAATTC 10380 

ATCAATTTGT ATCACCATAC TATAATAGAA GAAATGATGT ATGGGCAAAT CAATATAAAT 10440 

TCCCGGTCGC TGTGATTGAA GAAGTACTTA AAGCGAAAGA AGCGTATGGC AATAAAGACT 10500 

TTATAGTTGG ATACAGATTA TCTCCAGAGG AAGCGGAGTC TCCAGGAATC ACAATGGAAA 10560 

TTACAGAGGA ACTCGTTAAT AAAATTAGCC ATATGCCAAT CGACTATATT CATGTTTCAA 10620 

TGATGGATAC GCATGCAACG ACACGTGAAG GTAAATACGC TGGACAAGAA AGACTGCCTT 10680 

TAATTCACAA ATGGATAAAT GGTCGTATGC CACTTATCGG TATTGGTTCA ATTTTCACAG 10740 

CTGACGAAGC TTTAGATGCA GTTGAAAATG TTGGTGTTGA CTTAGTAGCC ATTGGTAGAG 10800 

AGCTACTACT GGATTATCAA TTTGTTGAAA AAATTAAAGA TGGACGGGAA GATGAAATTA 10860 
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AATTTAATGA AGGGTTTTAT CCATTACCAC GTA 10953 
(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8155 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:
TTTGATAnAA AACTGAATnA ATTAAATGTA TCGATTCAAC CTAATGAAGT GAATTTACAA 60 

GTTAAAGTAG AGCCTTTTAG CAnAAAGGTT AAAGTAAAT6 TTAAACAGAA AGGTAGTTTA 120 
GCAGATGATA AAGAGTTAAG TTCGATTGAT TTAGAAGATA AAGAAATTGA AATCTTCGGT 180 

AGTCGAGATG ACTTACAAAA TATAAGCGAA GTTGATGCAG AAGTAGATTT AGATGGTATT 240
TCAGAATCAA CTGAAAAGAC TGTAAAAATC AATTTwCCAG AACATGTCAC TAAAGCACAA 300 
CCAAGTGAAA CGmAGGCTTA TATAAATGTA AAATAAATAG CTAAATTAAA GGAGAGTAAA 360 

CAATGGGAAA ATATTTTGGT ACAGACGGAg TAAGAG6TGT CGCAAACCAA GAACTAACAC 420
CTGAATTGGC ATTTAAATTA GGAAGATACG GTGGCTATGT TCTAGCaCAT AATAAAGGTG 480 
AAAAACACCC ACGTGTACTT GTAGGTCGCG ATACTAGAGT TTCAGGTGAA ATGTTAGAAT 540 



CAGCATTAAT AGCTGGTTTG ATTTCAATTG GTGCAGAAGT GATGCGATTA GGTATTATTT 600 

CAACACCAGG TGTTGCATAT TTAACACGCG ATATGGGTGC AGAGTTAGGT GTAATGATTT 660 

CAGCCTCTCA TAATCCAGTT GCAGATAATG GTATTAAATT CTTTGGATCA GATGGTTTTA 720 

AACTATCAGA TGAACAAGAA AATGAAATTG AAGCATTATT GGATCAAGAA AACCCAGAAT 780 

TACOAGACC AGTTGGCAAT GATATTGTAC ATTATTCAGA TTACTTTGAA GGGOCACAAA 840 

AATATTTGAG CTATTTAAAA TCAACAGTAG ATGTTAACTT TGAAGGTTTG AAAATTGCTT 900 

TAGATGGTGC AAATGGTTCA ACATCATCAC TAGCGCCATT CTTATTTGGT GACTTAGAAG 960 

CAGATACTGA AACAATTGGA TGTAGTCCTG ATGGATATAA TATCAATGAG AAATGTGGCT 1020 

CTACACATCC TGAAAAATTA GCTGAAAAAG TAGTTGAAAC TGAAAGTGAT TTTGGGTTAG 1080

CATTTGACGG CGATG6AGAC AGAATCATAG CAGTAGATGA GAATGGTCAA ATCGTTGACG 1140 

GTGACCAAAT TATGTTTATT ATTGGTCAAG AAATGCATAA AAATCAAGAA TTGAATAATG 1200 

ACATGATTGT TTCTACTGTT ATGAGTAATT TAGGTTTTTA CAAAGCGCTT GAACAAGAAG 1260

GAATTAAATC TAATAAAACT AAAGTTGGCG ACAGATATGT AGTAGAAGAA ATGC6TCGCG 1320 
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CTGGTGATGG TTTATTAACT GGTATTCAAT TAGCTTCTGT AATAAAAATG ACTGGTAAAT 1440 

CACTAAGTGA ATTAGCTGGA CAAATGAAAA AATATCCACA ATCATTAATT AACGTACGCG 1500 

TAACAGATAA ATATCGTGTT GAAGAAAATG TTGACGTTAA AGAAGTTATG ACTAAAGTAG 1560 

AAGTAGAAAT GAATGGAGAA GGTCGAATTT TAGTAAGACC TTCTGGAACA aACCATTAGT 1620 

TCGTGTCATG GTTGAAGCAG CAACTGATGA AGATGCTGAA aGATTTGCAC AACAAATAGC 1680 

TGATGTGGTT CAAQATAAAA TGGGATTAGA TAAATAAATA CTGTATTACA AATGAGCCGA 1740 

TGCGTATGcA nTcgtTTTTT GTGTTTGTAG AAATAATTTA TAGTACAAAC GTAAAATGAT 1800 

ATAAACAAAA TAAAAACAAA GTAATCAATA TGTAATATAA AATACACTGG TACTCAATAT 1860 

ATAATGATGA TAAAATTAAT TTTAATTAGA TAGAGTTGCT TTGTGTTTTT AACGCAGATG 1920 

CTACTACTTA TCTTAACAGT TGATTAAGT6 AAATCATTTA ACAGCGA6AA TAATCAACCA 1980 

GGAGGATGAC TTAATGAATT TATTCAGACA ACAAAAATTT AGTATCAGAA AATTTAATGT 2040 

CGGTATTTTT TCAGCTTTAA TTGCCACTGT TACTTTTATA TCTACTAACC CGACAACAGC 2100 

6TCTGCAGCA GAGCAAAATC AGCCTGCACA AAATCAACCA GCACAACCAG CTGATGCCAA 2160 

TACACAGCCT AACGCAAATG CTGGTGCTCA AGCTAATCCT ACAGCACAGC CAGCTGCACC 2220

TGCCAACCAA GGACAACCAG CAGTACAACC AGCAAACCAA GGTGGACAGG CTAATCCAGC 2280 

AGGAGGAGCA GCACAACCAA ATACACAACC AGCTGGACAA GGTGATCAAG CTGATCCGAA 2340 

TAACGCTGCA CAAGCACAAC CTGGAAATCA AGCAACACCG GCAAACCAAG CAGGTCAAGG 2400 

AAATAACCAA GCAACACCTA ATAATAATGC AACACCGGCA AATCAAACAC AGCCAGCGAA 2460 

TGCTCCAGCA GCAGCGCAAC CAGCAGCACC TGTAGCAGCA AACGCACAAA CTCAAGATCC 2520 

AAATGCTAGC AATACTGGTG AAGGCAGTAT TAATACGACA TTAACATTTG ATGATCCTGC 2580 

CATATCAACA GATGAGAATA GACAGGATCC AACTGTAACT GTTACAGATA AAGTAAATGG 2640 

TTATTCATTA ATTAACAACG GTAAGATTGG TTTCGTTAAC TCAGAATTAA GACGAAGCGA 2700 

TATGTTTGAT AAGAATAACC CTCAAAACTA TCAAGCTAAA GGAAACGTGG CTGCATTAGG 2760 

TCGTGTGAAT GCAAATGATT CTACAGATCA TGGTAACTTT AACGGTATTT CAAAAACTGT 2820 

AAATGTAAAA CCAGATTCAG AATTAATTAT TAACTTTACT ACTATGCAAA CGAATAGTAA 2880 

GCAAGGTGCA ACAAATTTAG TTATTAAAGA TGCTAAGAAA AATACTGAAT TAGCAACTGT 2940 

AAATGTTGCT AAGACTGGTA CTGCACATTT ATTTAAAGTA CCAACTGATG CTGATCGTTT 3000 

AGATTTACAA TTTATTCCTG ACAATACAGC AGTTGCTGAT GCTTCAAGAA TTACAACAAA 3060

TAAAGATGGT TATAAATACT ATTCATTCAT TGATAATGTA GGTCTATTCT CAGGATCACA 3120 
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TAATACTGAA ATCGGTAACA ATGGTAATTT TGGTGCTTCA TTAAAAGCAG ATCAATTTAA 3240 

ATATGAAGTA ACATTACCAC AAGGTGTAAC TTACGTTAAT AATTCATTAA CTACAACATT 3300 

CCCTAATGGT AATGAAGACA GTACAGTATT GAAAAATATG ACTGTTAATT ATGATCAAAA 3360

TGCAAATAAA GTTACATTTA CAAGCCAAGG TGTGACAACG GCACGTGGTA CACACACTAA 3420 

AGAAGTTTTA TTCCCAGATA AATCTTTAAA ATTATCATAT AAAGTTAATG TTGCGAATAT 3480 

CGATACACCT AAAAATATTG ATTTTAATGA AAAATTAACA TATCGTACTG CTTCAGATGT 3540 

TGTAATTAAT AATGCGCAAC CAGAAGTaCA CTAACTGCAG ATCCATTTTC AGTAGCGGTT 3600 

GAAATGAACA AAGATGCGTT GCAACAACAA GTAAACTCAC AAGTTGATAA TAGTCATTAC 3660 

ACAACAGCAT CAATTGCAGA ATACAATAAA CTTAAACAAC AAGCAGATAC TATTTTAAAT 3720 

GAAGATGCGA ATCATGTTAA AACTGCAAAT CGTGCATCTC AAGCGGATAT TGATGGTTTA 3780

GTAACTAAAT TACAAGCTGC ATTAATTGAT AATCAAGCAG CAATTGCTGA ATTAGATACT 3840 

AAAGCTCAAG AAAAGGTTAC AGCAGCACAA CAAAGTAAAA AAGTTACGCA AGATGAAGTT 3900 

GCAGCACTTG TAACTAAAAT TAACAATGAT AAAAATAATG CAATCGCAGA AATTAATAAA 3960 

CAAACTACAG CACAAGGTGT CACAACTGAA AAAGATAATG GTATCGCAGT GTTAGAACAA 4020

GATGTGATTA CACCAACAGT TAAACCTCAA GCGAAACAAG ATATTATCCA AGCAGTTACA 4080 

ACTCGTAAAC AACAAATTAA AAAGTCAAAT GCATCATTAC AAGATGAAAA AGATGTAGCA 4140 

AATGATAAAA TTGGTAAAAT TGAAACAAAG GCAATTAAAG ATATTGATGC AGCAACAACA 4200

AATGCACAAG TAGAAGCCAT TAAAACAAAA GCAATCAATG ATATTAATCA AACTACACCT 4260 

GCTACAACAG CTAAAGCAGC AGCTCTTGAA GAATTTGACG AAGTTGTTCA AGCACAAATT 4320 

GATCAAGCAC CTTTAAATCC TGATACAACA AATGAAGAAG TAGCGGAAgC TATTGAACGT 4380

ATTAATGCAG CTAAAGTTTC TGGTGTTAAA GCAATTGAAG CGACAACGAC TGCACAAGAT 4440 

TTAGAAAGAG TTAAAAACGA AGAAATCTCA AAAATTGAAA ATATTACTGA CTCTACGCAA 4500 

ACAAAAATGG ATGCCTATAA TGAAGTTAAA CAAGCTGCAA CAGCTAGAAA AGCTCAAAAT 4560 

GCTACAGTTT CAAATGCAAC AAATGAAGAA GTAGCAGAAG CTGATGCAGC AGTAGATGCA 4620 

GCTCAAAAGC AAGGTTTACA TGACATCCAA GTTGTTAAAT CAAAACAGGA AGTTGCTGAT 4680 

ACAAAATCAA AAGTATTAGA TAAAATCAAT GCAATTCAAA CACAAGCAAA AGTTAAACCT 4740 

GCAGCTGATA CGGAAGTAGA AAAOSCATAT AATACACGTA AACAAGAAAT TCAAAATAGC 4800 

AAT6CTTCAA CTACAGAAGA AAAACAAGCT GCATATACAG AATTAGATAC TAAAAAGCAA 4860

GAAGCAAGAA CAAATCTTGA TGCTGCAAAT ACAAACAGTG ATGTAACAAC AGCTAAAGAC 4920 
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GCGGAAATCG CTCAAAAAGC AAGTGAACGT AAAACAGCAA TTGAAGCAAT GAATGATTCG 5040 

ACTACTGAAG AACAACAAGC AGCGAAAGAC AAAGTGGATC AAGCAGTAGT TACTGCAAAC 5100 

GCT6ATATAG ATAATGCTGC AGCAAACAAT GATGTGGATA ATGCAAAAAC TACAAATGAA 5160

GCTACAATCG CAGCCATTAC ACCTGATGCA AATG1TAAAC CAGCAGCAAA ACAAGCAATT 5220 

GCA6ATAAAG TACAAGCTCA AGAAACAGCA ATTGATGGAA ATAACGGCTC AACAACTGAA 5280 

GAAAAAGCAG CTGCTAAACA ACAAGTTCAA ACTGAAAAAA CAACAGCTGA TGCCGCAATA 5340

GATGCAGCAC ATACAAATGC GGAAGTTGAA GCGGCTAAAA AAGCAGCAAT TGCTAAAATT 5400 

GAAGCGATTC AGCCAGCAAC AACAACTAAA GATAATGCGA AAGAAGCAAT TGCTACGAAA 5460 

GCGAATGAAC GTAAAACAGC AATCGCTCAA ACGCAAGACA TTACTGCTGA AGAAATTGCA 5520 

6CGGCTAATG CGGACGTAGA TAATGCTGTG ACACAAGCAA ATAGCAACAT TGAAGCTGCT 5580 

AATAGTCAAA ATGATGTAGA CCAAGCGAAA ACX3ACAGGTG AAAATAGTAT TGATCAAGTA 5640 

ACACCAACAG TTAATAAAAA AGCAACTGCA CGTAATGAAA TCACAGCAAT TTTAAATAAC 5700 

AAATTGCAAG AGATTCAAGC tACGCCAGAT GCAACAGATG AAGAAAAACA AGCAGCTGAT 5760 

GCTGAAGCAA ATACTGAAAA TGGTAAAGCA AATCAAGCCA TTTCAGCAGC AACTACTAAC 5820

GCACAAGTTG ATGAAGCTAA AGCAAATGCA GAAGCAGCGA TTAATGCGGT AACACCAAAA 5880

GTTGTGAAGA AACAAGCGGC TAAAGATGAA ATTGATCAAT TACAAGCAAC GCAAACAAAT 5940 

GTTATCAATA ATGATCAGAA CGCTACAACA GAAGAAAAAG AAGCAGCTAT TCAACAATTA 6000

GCAACAGCAG TTACAGACGC GAAAAATAAT ATTACAGCTG CAACTGATGA TAATGGTGTA 6060 

GATCAGGCGA AAGACGCTGG AAAGAATTCA ATTCAAAGCA CGCAACCAGC AACAGCGGTT 6120 

AAATCAAATG CTAAAAATGA TGTTGATCAA GCTGTGACAA CTCAAAATCA AGCAATTGAT 6180

AATAGAACTG GTGCTACAAC TGAAGAGAAA AATGCAGCAA AAGATTTAGT TTTAAAAGCT 6240 

AAAGAAAAAG CGTATCAAGA TATCTTAAAT GCACAAACAA CTAATGATGT TACGCAAATT 6300 

AAAGATCAAG CAGTTGCTGA TATTCAAGGT ATTACTGCAG ATACAACAAT TAAAGATGTT 6360 

GCX3AAAGATG AATTAGCAAC AAAAGCAAAC GAACAAAAAG CGCTTATTGC ACAAACTGCA 6420 

GATGCGACTA CTGAAGAAAA AGAACAAGCA AATCAACAAG TAGACX3CACA ATTAACACAA 6480 

GGTAATCAAA ATATTGAAAA TGCACAGTCA ATCGATGATG TAAACACTGC AAAAGATAAT 6540 

GCAATTCAAG CAATTGACCC AATTCAAGCA TCAACAGATG TTAAAACGAA TGCAAGAGCG 6600 

GAATTGCTAA CTGAAATGCA AAATAAAATA ACTGAAATAC TTAATAATAA TGAGACTACT 6660

AATGAAGAAA AAGGTAACGA TATTGGACCA GTTAGAGCAG CATATGAAGA AGGTTTAAAT 6720 
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AAAGTTCAAC AACTTCATGC AAATCCTGTT AAGAAACCAG CAGGTAAAAA AGAATTAGAT 6840

CAAGCTGCAG CTGATAAGAA AACACAAATA GAACAAACAC CAAATGCATC ACAACAAGAA 6900 

ATTAATGATG CAAAACAAGA AGTTGATACT GAATTAAATC AAGCGAAAAC AAATGTCGAT 6960

CAATCATCAA CAAATGAATA TGTTGATAAT GCAGTTAAAG AAGGAAAAGC TAAAATTAAT 7020 

GCAGTTAAAA CATTTAGTGA GTACAAAAAA GATGCTTTAG CTAAAATTGA AGATGCATAT 7080 

AATGCTAAAG TAAACGAAGC GGATAACTCT AACGCATCGA CTTCAAGTGA AATTGCTGAA 7140

GCGAAACAAA AACTTGCTGA ATTAAAACAA ACTGCGGATC AAAATGTTAA TCAAGCTACT 7200 

TCTAAAGATG ACATTGAAGT TCAAATTCAT AATGACTTAG ATAATATTAA CGATTACACA 7260 

ATTCCAACAG GTAAAAAAGA ATCAGCTACA ACAGATTTAT ATGCTTATGC AGATCAGAAG 7320 

AAAAATAATA TTTCAGCTGA CACTAATGCA ACACAAGATG AAAAGCAACA AGCAATTAAG 7380

CAAGTTGACC AAAATGTTCA AACTGCATTA GAAAGCATTA ATAATGGTGT GGATAATGGT 7440 

GACGTTGATG ATGCATTAAC ACAAGGTAAA GCAGCAATTG ATGCTATTCA AGTAGATGCT 7500 

ACTGTTAAAC CTAAAGCGAA CCAAGCTATT GAAGTTAAAG CAGAAGATAC GAAAGAATCT 7560 

ATTGATCAAA GTGACCAGTT AACTGCTGAA GAAAAAACTG AAGCATTAGC AATGATTAAA 7620

CAAATTACAG ATCAAGCTAA ACAAGGTATT ACTGATGCAA CAACAACTGC TGAAGTTGAA 7680 

AAAGCGAAAg cTCaAGGACT TGAAGCATTT GATAACATTC AAATCGACTC AACAGAAAAA 7740 

CAAAAAGCTA TCGAAGAATT AGAAACTGCA CTAGACCAGA TTGAAGCAGG TGTAAATGTC 7800

AACGCTGATG CTACAACTGA AGAAAAAGAA GCGTTTACGA ATGCTTTAGA AGACATTTTA 7860 

TCAAAAGCAA CTGaAGATAT TTCTGATCAA ACTACAAATG CAGAAATCGC TACTGTCAAA 7920 

AATAGTGCGC TTGAACT^CT TAAAGCACAA CGTATTAATC CTGAAGTTAA GAAAAATGCT 7980

TTGGAAGCAA TCAGAGAAGT GGTTAACAAG CAAATAGGAA tAATTAAAAA TGCAGATGCA 8040 

GATGCATCGG CGGAAAGAnA TTGCACGTAC GGGATTTAGG TAGATATTTT GGACCGATTT 8100 

GCTGGATAAA TTTAGGGTnA AACCCCAACC AATGCCGAAG TTGCCTGAAT TACCA 8155 
(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1630 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 
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CTGTTTTATT TGCAGCACCC ATACTGGAAA TCACTTTAAT CCCTCGGTCA AGACACTCTT 120 

TCATTAAGTG TACTTTGTAC ATTATTGTAT CACTTGCATC TACAAAATAA TCTATATCGT 180 

AGTTATCX3AA AATTTCTTCA TATGTCTCTT CTGTATAAAA CATATGTAAG GGCGTGACTT 240

TACAATCTGG ATTAATTAAT TTAATACGTT CTTCCATCAA AGAAACTTTA CTTTGTCCTA 300 

CCGTTGTAGT TAAAGCGTGT AATTGTCTGT TTACATTTGT AATATCAACA TCATCTTTAT 360 

CTATTAATAT AATATGACCA ATATTCGTTC TTGCTAATGC TTCAGCAGCA AATGAACCAA 420 

CACCTCCAAC GCCAAGTATG ACAACAGTTT GTTGCTTCAA TAAATCTAAA CCTTGTTGTC 480

CAATCGCTAG TTCATTTCTT GAAAATTGAT GTTTCATTAT TTTACCTCTT TCACTGATTT 540 

ATACATAAGT ACATAGTAAC TTAAAATTTT ATATTTAGCA TTATCACTTT GATTATTTTC 600 

CCAAAATTCA ACGAGGAAAC ATTTATTAAA OGCTATAAAA CCCAACTAAT TCTTTATTAA 660 

AAACTTAAAG AAACGCATAA AAATAC6CAA GACAAAGTCT TGCGTATCGA TAGA6TCCGT 720 

ATTGCCGTAG TTATAATAGC TTGATCATTC GGCCTGTTAT ATACAGGTGG GTGCCCTGTT 780 

TCTTGTTTTG TACGTCCTTC ATATAAGGCG TGTACGCTGC AAGAAAACCC ATTGGGCTCC 840 

CTTGATCAAA GAGTGTTAGG CCCAAATTAA AAAGCAAACT TACGAACAAC TCAGATGACT 900

ATCTTATGAT GTTATATTAC CACATAATTA AAATTAATGA AATTATAACA AACCAAAGTT 960 

TATTGATTTT TTAAAATTTA GTGACGAATT CGCAAAGAAA GTTCTTCTAA TTGTTTATCA 1020 

GAAACTTCAC TAGGCXjCATT CGTTAATAAA CATGTAGCAG ATGCTGTTTT AGGGAATGCG 1080

ATTGTATCTC TCAAGTTTGT TCTATTAGTC AATAACATGA CTAATCGGTC tAATCCTAAT 1140 

GCAATACCGC CATGTGGTGG TGCACCATAT TTAAATGCAT CTAGTaAGAA GCCGAACTGT 1200 

TCCTgTGCTT GTTCTTTAGT AAATCCAAGA ACTTCGAACA TTTTTTCTTG TAACTCACCA 1260 

TCAXSAATTC TGATTGAACC GCCACCTAAT TCATAACCAT TTAATACTAT GTCATAAGCA 1320 

TTTGCCTCAG CTTCtTCTGG CGCAGTGCCA AGCTTAGCAA TATCAGCTTC TTTTGGAGAT 1380 

GTAAATGGAT GATGTGCTGC AACGTAACGT TTCGCATCTT CATCATATTC TAATAATGGC 1440 

CAATCTGTCA CCCATAAGAA GTTTAATTTT GTTTCATCGA TTAAACCTAA TTCTTTAGCT 1500 

AATTTGACAC GTAATGCACC TAAACTTTGT GCAACGACAT TTGGTttGTC TGCAACAAAC 1560 

ATTACTAAGT CACCAGCTTC AGCACCAGTT AATGTAAGTA ATGTTTCAAC ATTTTCTGTT 1620 

CAAAGAAACG 1630 

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 732 base pairs 
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(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

CAATTGGACA TCTTGTATGA AAAGGACAAC CTTGCGGCGG ATTACTTGGC GAAGGTAATT 60 

CTCCTTTTAA TATAATTCTA TTGTTATTAT GTTTATCAAT TTGTGGTATT GATGAAATCA 120

ACGCTTTTGT ATATGGATGT TTGGGATTTT CATAAATTTC TTTATCAGAT GCGATTTCAA 180 

CTATATGACC TAAATACATA ACTCCAATGA CATCACTTAT ATGTTTTACT ACACTTAAAT 240 

CATGTGCGAT AAATAAATAG CTTAAGTTAA ATTGTTCTTG TAAATCTTTT AATAAATTCA 300

GTACTTGAGA TTGAACAGAT ACATCTAATG CACTTACAGG CTCATCAGCA ACAATTAAAC 360 

TCGGACGCAA AGCCAATGCT CTTGCAATTC CCACTCTTTG TCTCTGTCCA CCTGAAAATT 420 

CATGTGCATA TTtATAATAT GCATCTTCAC TTAGGCCAAC ACATTTTAAT AAATATAGTA 480 

CTTCTTTTTT TATTTCTTCT TTTGGCAATT TTTTATAATT TAAAATAGGT TCTGAAATGA 540 

TATCTCCAAC CATTTGCATC GGATTCAATG ATGCATACGG ATCTTGAAAT ATCATCTGAT 600 

ATTGTTGTCG TGATTTTCTG AGTTTTTTAC CTTGTAATCT TGTTATATCT TCACCATTAA 660 

CAATTATTGA GCCTGAAGTT GCATCTTCAA GCCTGATAAT CACTTTACCT AACGTTGACT 720 

TACCACAACC CG 732

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5838 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 



AATATATTCA TATGTTTCAT CAACAATATT AGCTGCTTTT TGAATTAAAG CAATTTCGTC 60

AGCATCTTTG ACGTCTCTAA TTTTATCTAC AGTATTAGAA ATGCTTATTA ATGATATACG 120

GCTTTTATTT AATTCAAGGT ATGTATCATA ACTTACATGA TGCCCCTCAA AACCTACATT 180

TTCAAAATTT TCTTGGTGTA GCAATTCTTT AATCTCACCA ATAATAGTAG ATTTACGATT 240

AATAATTTCA TAATTTGGCG CCTGCTTAGT TGCTTGATCA ATATATCTAA AGTCTGTTAT 300

CAAATATTGT TTATCTTTAG ATATGATAAG TGCTCCACTG GTACCAGTAA AACCTGATAA 360

ATATCTTCTA TTGTAATCCG AAAGAATGaT AATCGCATCT AAATGTTTTT GTTCTTU^AAT 420
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CAACTTTATA CATTAAAATA ATATCATAAT AAGGATAAAA AATAATAGAT ATTGATTTTA 540

GGGAGATAGT AATGAAAAAA TTGGTTTCAA TTGTTGGCGC AACATTATTG TTAGCTGGAT 600

GTGGATCACA AAATTTAGCA CCATTAGAAG AnAAAACAAC AGATTTAAGA GAAGATAATC 660

ATCAACTCAA ACTAGATATT CAAGAACTTA ATCAACAAAT TAGTGATTCT AAATCTAAAA 720

TTAAAGGGCT TGAAAAGGAT AAAGAAAACA GTAAAAAAAC TGCATCTAAT AATACGAAAA 780

TTAAATTGAT GAATGTTACA TCAACATACT ACGACAAAGT TGCTAAAGCT TTGAAATCCT 840

ATAACGATAT TGAGAAAGAT GTAAGTAAAA ACAAAGGCGA TAAGAATGTT CAATCX5AAAT 900

TAAATCAAAT TTCTAATGAT ATTCAAAGTG CTCACACTTC ATACAAAGAT GCTATCGATG 960

GTTTATCACT TAGTGATGAT GATAAAAAAA CTGTCTAAAAA TATCGATAAA TTAAACTCTG 1020

ATTTGAATCA TGCATTTGAT GATATTAAAA ATGGCTATCA AAATAAAGAT AAAAAACAAC 1080

TTACAAAAGG ACAACAAGCG TTGTCAAAAT TAAACTTAAA TGCAAAATCA TGATAGGAGT 1140

CTTTTAATGC GTAATATAAT ATTTTATCTT GTACTTATTA TTGCTGCGAT TGGATTAGTA 1200

ATCAATCTAG ATGCCTTTAT TTTTTCAATC GTCAGAATGT TAATCAGCTT TGcgTAaTAG 1260

CTGCTATTAT TTATCTGATT TATTATTTCT TCATCTTAAC TGAAGACCAA CGCAAATATC 1320

GCAAAGCAAT GCgTrAaGTA TAAAAGAAAT CAAAGAAGAA AATAGATAAA AAAACGGAAG 1380

CACTTGTAGG TAAAATAGTC TACGTGCTTC CATTTTTTAT TCTAAAAACT ACTTTCTAAA 1440

CATCCATTCA TCTGAACGAT ATTTTTCAGT TAATTCTTCC ACTTCTGCCA ATTGAGCTTC 1500

TGtTAATTCA AGTGGCTTTA ATTCTATATT TAAACCTTTC TTAAAACCTT TCTCGAAAGC 1560

TTCTTCCATT TGACTAATAG TAATGTGTTC ATCTGAAATA TCATTGATGG CAACTGCTTT 1620

TTCAACX3AAT GCCTCTTTCA TTTTTAATTT TAATCTTTCA TTTTTATAAA TrAACATATC 1680

AAACAGTTCA TCAATATCAA TATCTTGTAA AATCQAACCG TGTTGGAGGA TTACGCCCTT 1740

TTGTCTCGTT TGAGCACTCC CAGCAATCTT ACGGCCTTCA ACAACTAGCT CATACCAACT 1800

TGGTGCATCA AAACACACTG AACTTCGAGG TTGTTTTAAT TTTTGACGCT CTTCAGGCGT 1860

TTTAGGTACC GCAAAATAAG TATCAAATCC TAAGTTTTTA AATCCTTCTA ATAATCCTTG 1920


45 


TGAAATCACT CTGTACGCTT 


CTGTAACTGT AGAAGGCATA TTCGGATGCG ATTCAGGCAC 


1980 




AATCACACTG TAAGTTAACT 


CTTTATCATG TAGCACCCCA CGGCCACCAG 


TTTGACGCCT 


2040 




TACX3AGACCA AAACCTTTCT 


CTTTAACCTT ATCAATATCA ATTTCTTTTT 


GTAGCCTTTG 


2100 


SO 


GAAATACCCT ATTGATAATG 


TTGCAGGATT CCATGTGTAA AAACGTATAA 


CTGGATCAAT 


2160 




TTCACCTCTA GAGACAAAAT 


TTAATAACGC TTCATCCATT GCCATATTAT 


AATATGGGTC 


2220 
AAATGTATAA TATTTGATTC GCTAATTAAT CAATTTAACT AAATGAATAA TAATTGCAAT 2340

TCTTTAGTGA AATATTTTGA TAATTTGACC TAACAGTCTT ATAATTATAT TATCGTTTAA 2400 

TTAGGGAGGA TGCAAGATGA GTGCTAGTTT GTACATCGCA ATAATTTTAG TTATAGCAAT 2460 

TATTGCTTAT ATGATTGTTC AACAAATTCT TAACAAGCGA GCTGTTAAAG AATTAGATCA 2520

AAATGAATTC CATAATGGGA TTAGAAAAGC TCAAGTCATC GATGTTAGAG AGAAAGTTGA 2580 

CTATGACTAC GGTCACATTA ATGGGTCTCG CAATATTCCT ATGACAAT6T TCAGGCAACG 2640 

ATTCCAAGGA TTAAGAAAAG ATCAACCGGT ATACTTATGT GATGCCAATG GGATTGCTAG 2700 

CTATAGAGCC GCTCGTATTT TGAAAAAGAA TGGATATACA GATATCTATA TGTTAAAAGG 2760 

CG6CTATAAA AAATGGACTG GAAAAATAAA 6TCTAAAAAA TAGTTTTTGT AAATTTAATA 2820 

TACGATTTAA TAAAATCTGA GTGTTAATTG ATCATCAATA ACAATACTCA GATTTTAATT 2880 

TTTTAACAAA GTCTGTTACT ATATTTCTCT AGCTTCACTG ATCATTAAAC TTAGTTTCAG 2940

CATAATAAAG AAAGTTCAGC TCATTTTCAA TACGATTCAA TTACCGCAAT CTAAAAAAT6 3000 

AAAAGACAAT TTCTATGAAA GAATAATACC AAACCCTAAG AGTTATTACT TCGGTTTAGT 3060 

TTTCTTGTTT AAATAGAAAT TGTCTTTTTC AATTGATTTT GAAACCATTA TCCTTAAATC 3120 

TTCATACAAA GTTAGAATAA TAATTCTCGG AATATGTGTT TAATACTTTA TTTTTCCTGT 3180 

TTAAGATTTT CAAACTTTAA TATTGGTTTA CGAGCAGCTG TAGCTTCGTC TAATCGATCA 3240 

ATCACAGTTG TATGTGGTGC TTCTAGCacT TTATCAGGAT CATTTTTAGC TTCTTCAGCA 3300 

ATACTAATTA ATGTATCGAT AAAATAATCA AGTGTTTCTT TAGACTCTGT CTCAGTCGGT 3360 

TCAATCATCA TACCTTCTTC AACATTTAAT GGGAAGTATA TTGTTGGTGG ATGTACACCG 3420 

AAATCTAATA ATCGCTTAGC CATGTCTAAA GTACGTACAC CAAATTCTTT TTGACGCACA 3480 

CCACTTAACA CAAACTCGTG TTTACAATAT TGTTTATAAG GTATTTCT^ GTGTTTAGAT 3540 

AAACGTGCTT TAATATAATT CGCATTAAGA ACCGCTGCTT CA6AAACCTC TTTAAGTCCA 3600

GTTGCTCCCA TAGTTCGAAT ATACGTATAA GCTCTTAAGT AAATACCAAA GTTACCATAA 3660 

AATGGTTTTA CACGTCCGAT AGAATTTTTA ATGTCATTAT CATATTTAAA TTTGTCGCCA 3720 

TCTTTAATAA CCATTGGCTT TGGTAAGTAA CTTGCTAGTT CTTTTACTAC ACCGACTGGA 3780

CCTGAACCAG GACCGCCACC ACCATGTGGA CCAGTAAATG TTTTATGCAA GTTTAAATGA 3840

ACAGCATCAA ATCCCATATC TCCTGGGCGA ACTTTGTCCA TAATAGCGTT TAAATTCGCA 3900

CCATCATAAT ATAATAGACC ACCAGCATTA TGGACGATTT CACGGATTTC CATAATATTT 3960 

TTTTCGAAAA TACCTAAAGT GTTTGGATTA GTTAACATAA TAGCTGCTGT ATTTTCATTT 4020 

GATTTAAATC CTGCAAATGa AGCTGAGGCT GGaTTOGTAC CATGCGCAGA ATCTGGCACA 4140

ATGACTTCAT CACGATGACC TTCACCATTA TTCTCATGGT AAGCTTTAAA TATCATCAAT 4200

GCAGTCCATT CACCATGTGC GCCAGCAGCT GGTTGTAATG TCACCTCATC CATACCAGTA 4260

ATTTCTTTTA ATTCTTCTTG CAAACTATAA ATAATTTCTA ATGAACCTTG AACTTGATCT 4320

TCATCTTGTA ATGGATGTGA TTCACTAAAT CCTGGTATTC TAGCAACCTT TTCATTAATT 4380

TTAGGGTTAT ACTTCATCGT ACATGAACCC AATGGATAAA ATCCGTTGTC TACACCGAAA 4440

TTTTTATTTG AAAGTTCAGT ATAATGACGT ACTAAGTCTA GTTCAGCAAC TTCAGGAAAC 4500

TCCGCTTTGT TTTTACGAAT AAATTTATCA TCTAACAATG ACTCAACAGA ATTTGTTTTA 4560

ATATCACTTT TTGGTAATGA ATATGCATAT CTGCCTTCAC GAGATCTTTC AAAAATTAAT 4620

GGACTTGATT TACTAGTCAT TTAACTCACC AGCCTTTTCT ACAAATGTAT CGATTTCATC 4680

TTTTGTTCTT AATTCAGTTA CAGCTATTAA CATGTGATTT TTAAAGTOGT CTGAAACAAC 4740

ACCTAAATCA AAACCACCGA TAATATTGTA CTTCACTAAT TCCTCGTTAA CTTGTTGAAT 4800

TGGTTTGTCA AATTTGACTA CAAACTCATT GnmAAGnTGT ACCATCTAAT ACTTCAAAAC 4860

crrmTAAT aaattgttgt ttagcatagt TAGCATGTTC TATATTTTGA ACTGCAATAT 4920

CATAGATACC TTGTTTACCA AGT6CTGACA TTGCAATTGA TGaCGcTAAA GCATTTAATG 4980

CTTGGTTAGA ACAAATATTA GATGTCGCTT TATCGCGTCG AATATGTTGT TCACGTGCTT 5040

GTAATGTTAA TACAAAGCCA CGATTACCTT CATCATCTTG TGTTTGACOG ACTAATCTAC 5100

CTGGCACTTT ACGCATTAAC TTTTTCGTCG TTGCAAAATA TCCACAATGT GGCCCACCGA 5160

ATTQAGCAGG AATTCCGAAT ggctgagtat CACCTACAAC AATATCTGCA CCAAATGAAC 5220

ctggaggtgt aagtaatccc aatgctaatg GATTTGCATA TACGATAAAT AATGCTTTTT 5280

TATC&CAAT AAAGCTATGA ATCTTTTCAA GATCTTCAAT TGAACCGTAA AAGTTTGGAT 5340

ATTGTACTGC AACAGCTGCT GTTTCATCAT CCACTGCTGC TTCTAATTTT TTCAAATCTG 5400

TAACAGTGCC ATCTAAATCG ATTTCCACTA CTTCGAATTC CTTACGCGTC TTAGCATAAG 5460

TATGAAGTAC TTGTAATGCT TGATAATGTA AACCTTTTGA GACTACAATT TTATTTTTCT 5520

TTGTTTGACT AAATGCTAAG ATACATGCTT CAGCAAAGCT AGTCATCCCA TCATACATAG 5580

AAGAATTTGC TACATCCATA TCTGTTAATT CACAAATTAA AGTTTGGAAC TCAAAAATGG 5640

CTTGTAATTC ACCTTGAGAA ATTTCCGGTT GATATG6CGT ATATGCTGTG TAAAATTCTG 5700

ATCTTGAAAT CATAGCATCC ACAACTGATG GCGCGTAATG ATCATAAACA CCAGCACCCA 5760

rAAATGATGT ATGCGTTTCT TTAGTGATAT tCTTGCTkGC AATGGGGATT TAAACnTCTA 5820
(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18355 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 





ATtlATAATTG GCTTTGCTAA TAATTACTTC CCTGAATTAC aAGTATTAGC AAACGAAATA 60

AAATCTGATA TGGCTAGTTC ATTAAAACAA TGATATTTTT ATTTAAATTT TTaAAGCTTT 120

GTAC6AAATT 6TACAAAGCT TTTTTGGTGC GTATTGTATG GGCAACAACT TGACGATGAA 180

AATCCGTTAC AGGATTGGTA ATAGGAAATG TTAGCGAAAG ACAAGGGTAT CCATTGTAGA 240

TTAACAAAAG GACGTTTCCA CAAGTGTGGG TTATTCTCAC TAAAGCAATA CGCAGAGACA 300

ACTTACGTAA AATTTTGAAC TGACTAGAAC GGAACTTCTA CTCAATTATT GATAAAAATT 360

TTCAAAAAGA CTTGAATGTG CTGAGAATAC GAAGTTTATG GAAGGATTAT CAAAATATAA 420

ATGTGCATTC ATTTACAACC TTTATTGACA ATGATTCTCA ACTAATATAG TATATAATCA 480

AATCGTAATA GTTACGATTT GTTTTCTGCA ACTTTTTTGA AGTTTTAGTT GAGGTGAAAA 540

CAATAAAAGC ATCTAAGTGA ATGTAGTTAA CGGACAACTG CATTCGCTTG TAGAGCCACA 600

AGAAGCAACT TTAAATAAGG TTTACGGTTG CATTTTGATA CAACAACCGA TTACTAAGTC 660

ATGCTTTCCA CTTTGCGGGT TAGCATGACT TACCTAATAG ATAGAGCTAT TAGGTTCAGC 720

TTCTAAAAAA TTACAGTTTT AGAGGAATAC AGTTGcTTGc tTCGCAACAA CTGCATAAGA 780

GCCATGGTTT TCGCTTTTGC GAATTA6CAT GACTTACCTA CTAGATAGAG CTATTAGGTT 840

CATCTTCTAA AAAATTACAG GTTTAGAGGA ATACAGTTGT TTGcTTCGCA ACAACTGCAT 900

AAGAGCCTCT AGTAATTAAA ATTACAGAGG CTCTAAAAAT ACATCTAAAG GAGTGTCGTA 960

TGAATCGGCA GGTTATAGAA TTTTCTAAGT ATAATCCTTC GGGGAATATG ACGATACTTG 1020

TTCATTCAAA ACATGATGCT AGTGAATATG CATCTATCGC CAATCAGTTG ATGGCCGCAA 1080

CACATGTATG CTGTGAACAG GTAGGCTTTA TAGrATCAAC ACAAAATGAT GATGGTAATG 1140

ATTTTCACTT AGTTATGAGC GGTAATGAAT TTTGCGGTAA TGCGACGATG TCATATATAC 1200

ATCATTTGCA GGAAAGTCAT TTGCTTAAAG ACCAACAGTT TAAGGTGAAG GTGTCTGGCT 1260

GTTCGGATTT AGTGCAATGC GCAATTCATG ATTGCCAATA CTATGAAGTT CAAATGCCAC 1320

AAGCCCATCG TGTTGTGCCA ACAACAATTA ATATGGGTAA TCATTCATGG AAAGCAATAG 1380
TTATGTGCAC CCACCACTAT TTATGAATGA CTTTTCATTG AAAGCCATTT TCGAAGGAAC 3300

AGATGTACCG GTTTATGTGT ATAAGTTATT TCCTGAAGGA CCGATAACGA TGACACTAAT 3360

CCGTGAAATG CGTTTAATGT GGAAGGAAAT GATGGTTATT TTACAAGCAT TTAGAGTGCC 3420

GTCAGTCAAC CTGCTTCAAT TTATGGTGAA GGAAAATTAT CCAGTACGTC CTGAAACTTT 3480

GGATGAAGGT GATATTGAGC ATTTCGAAAT CTTGCCAGAT ATCTTACAAG AATATCTGCT 3540

TTATGTAAGA TATACCGCAA TCCTCATTGA TCCATTTTCA CAGGCAGACG AAAACGGACA 3600

TTACTTTGAT TTTTCAGCTG TACCATTTAA GCAAGTCTAT AAAAATGAAC AGGATGTTGT 3660

TCAAATTCCA AGAATGCCAA GTGAAGATTA TTACAGAACG GCGATGATTC AGCATATTGG 3720

GAAAATGCTA GGTATCAAAA CGCCAATGAT TGATCAGTTC CTAACTCGCT ATGAAGCAA6 3780

TTGCCAGGCG TACAAGGATA TX?CATCAAGA TCAACACTTA TCTTCTCAAT TTAATACAAA 3840

TCTATTTGAA GGAGATAAAG CACTCGTCAC AAAATTTTTG GAAATCAATA GAACGCTTTC 3900

ATAATAAGG6 TTTGAAGTTT TATAATAGAA AAAAATTATT GAATTATGTT TGACATTTAC 3960

ATAAAAATAA GCAAATAATT GAGAAAAATA ATCATTACGA TTTGATTAAG TAATGCAACT 4020

TATCAATTTA GAAAGAGGAA AAGCAAATGA GAAAACTAAC TAAAATGAGT GCAATGTTAC 4080

TTGCATCAGG GCTAATTTTA ACTGGTTGTG GCGGTAATAA AGGTTTAGAG GAGAAAAAAG 4140

AAAACAAGCA ATTAACGTAT ACGACGGTTA AAGATATCGG TGATATGAAT CCGCATGTTT 4200

ACGGTGGATC AATGTCTGCT GAAAGTATGA TATACGAGCC GCTTGTACGT AACACGAAAG 4260

ATGGTATTAA GCCTTTACTA GCTAAAAAGT GGGATGTGTC TGAAGATGGG AAGACATACA 4320

CGTTCCATTT GAGAGATGAC GTTAAATTCC ATGATGGTAC GCCATTTGca TGCtGACGCA 4380

GTTAAGAAAA ATATTGACGC AgTTCAAGAA AACAAAAAAT TGCATTCTTG GTTAAAGATT 4440

TCGACATTAA TTGACAATGT TAAAGTTAAA GATAAGTACA CGGTTGAATT GAATTTGAAA 4500

GAAGCATATC aacctgcatt ggctgaatta GCGATGCCTC GTCCATATGT ATTTGTGTCT 4560

CCAAAAGACT TTaAAAACGG TACAAcAAAA GATGGCGTTA AAAAGTTCGA TGGTACTGGT 4620

CCATTTAAAT TAGGTGAACA CAAAAAAGAT GAGTCTGCAG ACTTTAACAA AAATGATCAA 4680

TACTGGGGCG AAAAGTCTAA ACTTAACAAA GTACAAGCAA AA6TAATGCC TGCTGGTGAA 4740

ACAGCATTCC TATCAATGAA AAAAGGTGAA ACGAACTTTG CCTTCACAGA TGATAGAGGT 4800

ACAGATAGCT TAGACAAAGA CTCTTTAAAA CAATTGAAAG ATACAGGTGA CTATCAAGTT 4860

AAGCGTAGTC AACCTATGAA TACGAAAATG TTAGTTGTCA ATTCTGGTAA AAAAGATAAC 4920

GCTGTGAGTG ACAAAACAGT CAGACAAGCG ATTGGTCATA TGGTAAACAG AGATAAAATT 4980
ACAGACATTA ATTTCGATAT GCCAACACGT AAGTATGACC TTAAAAAAGC AGAATCATTA 5100

TTAGATGAAG CTG6TTGGAA GAAAGGTAAA GACAGCGATG TTCGTCAAAA AGATGGTAAA 5160 

AACCTTGAAA TGGCAATGTA CTATGACAAA GGTTCTTCAA GTCAAAAAGA AC/U^GCAGAA 5220

TACTTACAAG CAGAATTTAA GAAAATGGGT ATTAAGTTAA ACATCAATGG CGAAACATCA 5280 

GATAAAATTG CTGAACGTCG TACTTCTGGT GATTATGACT TAATGTTCAA CCAAACTTGG 5340 

GGATTATTGT ACGATCCACA AAGTACTATT GCAGCATTTA AAGAGAAAAA TGGTTATGAA 5400 

AGTGCAACAT CAGGCATTGA GAACAAAGAT AAAATATACA ACAGCATTGA TGACGCATTT 5460 

AAAATCCAAA ACGGTAAAGA GCGTTCAGAC GCTTATAAAA ACATTTTGAA ACAAATTGAT 5520 

GATGAAGGTA TCTTTATCCC TATTTCACAC GGTAGTATGA CAGTTGTTGC ACCaAAAGAT 5580 

TTAGAAAAAG TATCATTCAC ACAATCACAG TATGAATTAC CATTCAATGA AATGCAGTAT 5640 

AAATAAAGGA GCAATTAGAT GTTCAAATTT ATCTTAAAAC GTATTGCGCT CATGTTTCCA 5700

TTGATGATTG TAGTAAGTTT TATGACATTT CTATTGACGT ATATTACAAA TGAAAATCCA 5760 

GCTGTGACAA TTTTACATGC ACAAGGGACG CCAAATGTAA CACCAGAGTT GATTGCAGAA 5820 

ACGAATGAGA AGTACGGTTT CAATGATCCA TTATTAATTC AATATAAAAA TTGGTTACTT 5880

6AAGCGATGC AATTTAATTT TQGTACAAGC TACATTACAG GTGACCCAGT TGCTGAACGT 5940 

ATTGGTCCAG CATTTATGAA TACATTGAAA TTAACAATAA TTTCAAGTGT TATGGTGATG 6000 

ATTACATCAA TTATTTTAGG TGTAGTTAGT GCATTAAAAA GAGGAAAGTT CACTGATCGT 6060 

GCGATACGTT CAGTGGCTTT CTTTCTAACT GCATTACCAT CATATTGGAT AGCTTCAATA 6120 

CTTATTATTT ACGTTTCAGT GAAGTTAAAC ATATTGCCGA CTTCTGGATT AACAGGTCCA 6180 

GAAAGTTACA TATTGCCAGT GATCGTTATT ACGATTGCCT ATGCTGGTAT TTACTTTAGA 6240 

AATGTTAGAC GCTCGATGGT GGAACAATTA AATGAAGATT ATGTACTTTA TTTAAGAGCA 6300 

AGCGGTGTGA AATCTATCAC ATTAATGTTG CATGTGTTGC GTAATGCTTT ACAAGTT6CG 6360 

6TATCAATCT TTTGTATGTC TATACCAATG ATAATGGGTG GACTAGTTGT TATCGAGTAT 6420 

ATCTTTGCAT GGCCTGGACT AGGTCAATTA AGTTTAAAAG CAATACTTGA ACACGATTTT 6480

CCAGTCATTC AAGCATATGT ATTAATTGTA GCGGTATTAT TTATTGTATT TAATACATTA 6540

GCAGATATCA TTAATGCGCT ATTAAATCCA AGATTAAGGG aGGGCGCACG ATGATAATTT 6600 

TAAAmCGATT ATTttlCArGwT AAAGGTGCAG TAATTGCTTT AGGCATTATT GTATTATATG 6660 

TCTTTTTAGG ATTAGCAGCA CCACTTGTGA CATTTTATGA TCCTAACCAT ATCX3ATACAG 6720

CAAACAAATT TGCTGGCATG AGTTTTCAAC ATCTACTAGG TACTGACCAT TTAGGTAGAG 6780 

TATTTGTTTC TGTACTTATT GGATCTATTT TAGGATTCTT ATCAGGATAT TTCCAAGGGT 6900

TTGTTGACGC CTTAATCATG CX3TGCGTGTG ATGTTATGTT GGCATTCCCA AGTTATGTTG 6960 

TAACGTTAGC ATTAATTGCA TTGTTTGGAA TGGGTGCCGA AAATATTATC ATGGCATTTA 7020 

TTTTGACGCG TTGGGCATGG TTCTGTCGTG TTATACGTAC AAGTGTTATG CAGTACACTG 7080 

CTTCTGACCA TGTAAGATTT GCTAAAACAA TCGGTATGAA TGATATGAAA ATTATTCACA 7140 

AACATATTAT GCCATTAACA TTAGCAGATA TTGCTATCAT CTCTAGTAGC TCGATGTGTT 7200 

CAATGATCTT GCAAATATCT GGCTTTTCAT TTTTAGGATT AGGTGTCAAA GCGCCTACTG 7260 

CAGAGTGGGG CATGATGCTT AACGAaGCTA GAAAAGTGAT GTTTACACAT CCTGAAATGA 7320 

TGTTTGCGCC AGGTATTGCC ATA6TGATTA TAGTGATGGC ATTTAACTTC TTATCCGATG 7380 

CTTTACAAAT TGCTATTGAT CCCCGCATCT CTTCTAAAGA TAAACTTCGT TCT6TGAAAA 7440 

AAGGA6TGGT GCAATCATGA CATTGTTAAC AGTTAAACAT TTGACGATTA CAGATACCTG 7500 

GACAGATCAA CCACTOGTGA GTGATGTGAA TTTTACATTA ACTAAGGGTG AAaCTTTAGG 7560 

CXnTATTGGA GAAAGTGGTA GTGGTAAATC AATCACTTGT AAATCGATTA TTGGTTTGAA 7620 

TCCCGAACGA CTCGGGGTGA CAGGTGAAAT TATCTTTGAT GGTACAtCAA TGTTGTCATT 7680

ATCTGAATCG CAATTGAAAA AQTACCGTGG TAAAQACATT GCGATGGTCA TGCAACAAGG 7740 

TAGTC6TGCC TTTGACCCAT CAACTACTGT CGGTAAACAA ATGTTTGAGA CTATGAAAGT 7800 

ACATACGTCA ATGTCTACAC AAGAAATTGA AAAGACATTG ATTGAATATA TGGATTATTT 7860 

AAGTTTGAAA GATCCTAAAC GTATATTAAA ATCATACCCT TACATGTTAT CAGGAGGAAT 7920 

GTTACAGCGA TTGATGATTG CTTTAGCGTT AgcTTTgAAA CCAAAGTTAA TCATTGCTGA 7980 

TGAGCCGACA ACGGCTTTAG ATACAATTAC ACAATATGAT GTACTGGAAG CATTTATAGA 8040 

TATT^AAAAA CACTTTGACT GTGCGATGAT TTTCATTTCA CATGATTTAA CGGTTATTAA 8100 

CAAGATTGCA GACCGTGTTG TTGTGATGAA AAATGGTCAG CTTATTGAAC AAGGGACACG 8160 

TGAATCAGTC TTGCATCATC CAGAACATGT TTATACGArt ATTkTATTAT CAACGAAGAA 8220 

GAAGATTAAT GATCATTTTA AACATGTGAT GAGGGGTGAT GTACATGATT AAAATTAAAG 8280 

ATGTTGAAAA GTCATATCAA AGCX3CACATG TTTTTAAGCG TCGTCGAACA CCTATCOTGA 8340 

AAGGTGTGTC ATTTGAGTGT CCAATOGGTG CGACGATTGC GATTATCGGA GAAAGTGGTA 8400 

GCGGTAAATC GACGTTGAGT CktATGATAT TAGGTATTGA GAAACCGGAT AAAGGTTGTG 8460 

TAACCTTAAA TGATCAACCG ATGCATAAGA AGAAAGTGAG ACGTCATCAA ATTGGTGCTG 8520

TATTTCAAGA TTATACGTCA TCATTACATC CATTTCAGAC TGTTAGAGAA ATCTTATTTG 8580 



TGTTGGAAGA AGTCGGTCTA TCTAAGGCAT ACATGGATAA ATATCCTAAT ATGTTATCAG 8700

GTGGAGAGGC GCAACGTGTT GCGATTGCGC GTGCAATATG TATTAACCCT AAATATATTT 8760

TGTTTGATGA AGCCATTAGT TCACTCGACA TGTCAATTCA AACACAAATA TTAGATTTAT 8820

TGATTCATTT ACGTGAAACG CGTCAGTTGA GTTATATTTT TATCACACAT GATATTCAAG 8880

CTGCCACGTA TTTATGTGAT CAATTAATTA TTTTTAAAAA CGGAAAAATA GAAGAACAAA 8940

TTCCGACAAG CGCATTGCAT AAAAGTGACA ATGCTTATAC AAGAGAATTA ATAGAAAAAC 9000

AACTATCATT CTAAGGAGTG AGATAATGAA AGGTGCAATG GCTTGGCCCT TTTTGAGATT 9060

ATATATATTA ACATTOATGT TCTTTAGTGC CAATGCAATC TTAAACGTGT TTATACCTTT 9120

ACGAGGGCAT GATTTAGGC6 CAACXSAATAC GGTTATCGGT ATCGTTATGG GGGCATACAT 9180

GTTAACAGCA ATGGTATTTC GACCATGGGC AGGACAAATT ATTGCTCGTG TCGGTCCCAT 9240

TAAAGTATTA AGAATTATTT TGATTATCAA TGCCATAGCT TTAATTATTT ATGGTTTTAC 9300

TGGCTTAGAA GGTTATTTCG TAGCACGTGT TATGCAAGGT GTGTGTACGG CATTCTTTTC 9360

TATGTCTTTA CAGCTAGGTA TTATTGATGC ATTACCAGAG GAACATCGTT CTGAAGGTGT 9420

ATCATTGTAC TCGCTATTTT CAACXATTCC AAACTTAATC GGACCATTAG TTGCCGTAGG 9480

TATTTGGAAT GCAAATAATA TTTCACTATT TGCAATTGTC ATTATCTTTA TCGCATTAAC 9540

AACAACATTC TTTGsTATCG CGTGACCTTT GCTGAACAGG AACCCGATAC GTCAGATAAG 9600

ATTGAAAAAA TGCCGTTTAA CGCTGTAACT GTTTTTGCGC AATTTTTCAA AAATAAAGAG 9660

TTGTTGAACA GTGGTATTAT CATGATTGTT GCATCGATTG TATTTGGTGC AGTTAGTACA 9720

TTTGTACCGT TATACACAGT GAGTTTAGGA TTCGCGAATG CGGGAATCTT TTTGACAATA 9780

CAGGCCATCG CAGTTGTTGC GGCAAGATTT TACTTAAGGA AATACATTCC GTCAGATGGT 9840

ATG-ffiGCATC CTAAATATAT GGTATCTGTA CTATCATTAT TAGTAATCGC GTCATTTGTA 9900

GTGdCATTTG GTCCGCAAGT AGGTGCAATT ATTTTCTATG GTAGTGCGAT ATTAATAGGA 9960

ATGACGCAAG CAATGGTGTA CCCAACATTA ACATCATACT TAAGCTTCGT CTTACCAAAA 10020

GTAGGTCGTA ATATGTTGTT AGGTTTATTT ATTGCCTGTG CAGACTTAGG TATATCGTTA 10080

GGTGGCGCAT TGATGGGACC TATTTCCGAT TTAGTAGGAT TTAAATGGAT GTATCTAATT 10140

TGTGGTATGT TAGTCATTGT AATAATGATT ATGAGTTTCT TGAAAAAGCC AACACCACGT 10200

CCAGCGAGTA GTCTTTAATG AAGTGAATTA AAGCATATTA AGTTAATGAA TATTTAAATT 10260

TTAAAAGGTA TATTGaGCAT GGCGATTCAT GTGCTTCATG CTAGGACATG AAACATTCTA 10320

TATGGCTCGT TTTTAGAACG ACAtATATCT AAATAAAGCA CGCTTArAAG TGAGTTTTGA 10380
TTGCTAAAAT CCGAAATGTT GAACCTTATA AAACAATCAA TTCACCTAAC CGTTACGAAT 12300

TTATTCATAA TGCT6AAGAT TTGATTCGTT TCGTCGATCA GTTGCAGCAA TTAGGTCAAA 12360

AACCAGTAGG ATTCAAAATT GTAGTAAGCA AAGTTTCAGA AATTGAAACA CTTGTACGTA 12420

CGATGGTGGA ACTAGATAAG TATCCAAGCT TTATTACGAT TGATGGTGGT GAAGGTGGTA 12480

CTGGTGCAAC ATTCCAAGAA TTACAAGATG GTGTTGGCTT ACCGCTATTT ACAGCTCTAC 12540

CTATTGT6TC TGGCATGTTA GAAAAATATG GTATTCGAGA TAAAGTGAAA TTGGCGGCAT 12600

CTGGTAAGTT AGTGACACCA GATAAAATTG CGATTGCACT AGGTTTAGGT GCAGATTTTG 12660

TAAATATCGC ACGTGGGATG ATGATTAGTG TCGGTTGTAT AATGAGTCAA CAATGTCACA 12720

TGAATACJGTG TCCTGTAGGT GTTGCAACGA CAGATGCGAA GAAAGAAAAA GCATTGATTG 12780

TTGGAGAAAA GCAATATCGT QTCACAAACT AT6TAACAAG TTTGCATGAA GGCTTATTCA 12840

ATATTGCAGC AGCTGTTGGC GTATCCAGTC CTACAGAAAT TACTGCTGAT CATATTGTAT 12900

ATCGAAAAGT CGATGGTGAG TTACAAACGA TACATGATTA TAAATTAAAA CTCATTAGTT 12960

AACTTAATTA TTTCGGGAAA TTGAAAGCAG CGGATTTTAG CX3TTACTGCA AATAATTTTA 13020

TATTAGTAGT GGATGCTGGT CACACAAGAA CTTCAAATAT TAAA6CCCTC AGAATATGAA 13080

TTAAGGTTTG TAACCTTAGT CTTATCTGAG GGCATTTTTA AGTTATAAAC TATTTGTCGT 13140

CCATTTTATC TTTTTCTTTT AAACCTCTGT GCTTTAATTG CTTTTCAAGT TTTTCAAAAC 13200

TAATATCTTT ATTTTCTTTA GTCGAAACAC CAAGACGTTT ATTTAATTTT TTCATGTCAA 13260

CTTCTGTGTA ATCTATGTCT AAGTGyTCAA TTGCTTTTTT ATCTTTATAG TCTACTTTGT 13320

ATTTTACGCC TTTAAGGTCT TTGAAAATAC TTTCAGATTT GGCGAATAAC TTTTTGGCTT 13380

CGTCTTTATC CATACCTAGA TCGTCATATT TAATTGTGTT GATTGTAGAC TGTTTTAAAA 13440

CTTTATCATC TTTATATGTG ATAGAAGTTA GTACATGTTT ACCACTAACA TCACCWTCAT 13500

ATGTTTTGGT TTGTTCTTTA CCACAAGCTG ATAATGCAAT GATACAAACT AATGCTACTA 13560

CAATTAATGA ACATAATTTT TTCAAAGTCA GTCGCCTTCT TTCGATATTT GTATTATAAA 13620

GAAATTATAA CATTTACTAA AAAATGATGT TATTCAAAAA TTTAAATTTT GTCATTTTTT 13680

TTGAAGATAT 6AGTTTTTTT AAGCGGATTC CTCACAAAAT TTTAAAAATA TTTAAGCCTlc 13740

AAAATGATAA AGCGkTAGGG AACGTTTTTC TGAAAGTTAG TGATACAATA GTTTTAAGTT 13800

GAAATACAGG AGGATGAATA ACATGAATCA GTCAGTCAAA TTACTTAAAC ATTTAACAGA 13860

TGTAAACGGC ATTGCTGGTT ATGAAATGCA AGTTAAAGAA GCAATGCGTa ACTATATAGA 13920

GCCTGTCAGT GATCAAATTA TTGAAGATAA CTTGGGTGGC ATTTTTGGAA AGAAAAATGC 13980
AACAAAGATT GATAAACATG GTTTTATTTC ATTTACGCCA kTgGTGGATG GTGGAATCAA 14100

GTCATGCTAT CTCAAAAAGT AACGATTACA ACAGATTCGG GCAAAGAAAT TAGAGGTATC 14160

ATCGGTTCTA AACCGCCACA TGTCTTAACG CCTGAAGAAC GTAAAAAGCC AATGGAAATC 14220

AAAAATATGT TTATAGATAT TGGTGTTAGT AGCAAGGAAG AAGCTGAAGA AGCTGGCGTT 14280

GAAGTAGGCA ATATGGTTAC GCCATATAGT GAATTTGAAG TGCTTGCAAA TGATAAATAT 14340

TTAACTGCGA ArCATTTGAT AATCGCTATG GCTGTGCATT AGCTATTGAG GTATTAAAAC 14400

GTTTAAAAGA T6AAAATATT GGCATTAACT TATACAGTGG TGCCACAGTG CAAGAAGAAG 14460

TTGGTTTGCG TGGTGCGAAA GTGGCAGCGA ATACGATTAA ACCAGACTTG GCGATAgcTG 14520

TcGATGTAGG TATTGCTTAT GATACCCCAG GTATGTCAGG TCAAACGAGC GATAGTAAAC 14580

TAGGCGGTGG TCCAGTTGTC ATTATGATGG ATGCTACAAG TATTGCTCAC CAAGGTTTGC 14640

GAAAgcATaT TAAAGATGTA GCTAAGGAAC ATAACATCXSA AGTACAATGG GATACGACAC 14700

CA6GTGGAGG TACAGATGCG GGAAGTATTC ATGTCGCAAA TGAAGGTATT CCAACGATGA 14760

CAATCGGTGT TACGCTGCGA TACATGCATT CTAATGTTTC AGTGCTCAAT GTAGATGATT 14820

ATGAAAATTC TATCCGTCTT GTTACTGAAA TTGTCCGTTC ATTGAATGAT GAAAGTTATA 14880

AAAATATCAT GTGGTAATCA AATCCATAAA TAATAAAGAA TCCTTTTAAT ATGGTAGGTT 14940

GTTAAACAAT TGTCTAATTT TAATTCTTAG TCATTAGACA GTATCCATGT TAATAGGATT 15000

inTGTTTTT AATTTAAATG CTGAAAATCA ATTATGCCTA AATTTTGATA TTACAAGAAA 15060

ATGATTTTTT CTTAAATGTA ATTGCACTAA AAACCAAAAA AACGGGAATA ATATACCTGA 15120

TATATTACAT GAGGAGCGGT GCAAATGTTG TTAGAAATTA AAGATTTAGT GTATAAAGCG 15180

AGCGATAGAA TCATACTAGA TCATATCAGT CTAAAAGTAG ATAAAGGCGA GAGTATTGCC 15240

ATTftTAGGTC CATCAGGTAG TGGTAAAAGT ACATTTCAAA AGCAAATATG TAATTTGTTT 15300

AGTCCAACTA GTGGAGAACT TTATTTTAAA GGTAAACCCT ATAATGATTA TGACCCGGAA 15360

GAATTGCGTC AACXSAATCAG TTATTTGATG CAGCAAAGTG ACTTGTTTGG TGAAACGATT 15420

GAAGATAACA TGATATTCCC ATCACTTGCA CGTAATGATA AATTTGATAG AAAACGTGCA 15480

AAGCAATTAA TTAAAGATGT CGGTTTGGGA CATTATCAAT TAAGTTCGGA AGTGGAAAAT 15540

ATGTCGGGTG GTGAGCGGCA AAGAATTGCT ATAGCGCGCC AACTGATGTA TACACCGGAT 15600

ATTCTTTTAT TAGATGAATC GACCAGTGCA TTAGACGTTA ATAATAAAGA AAAGATAGAA 15660

AATATCATTT TTAAATTAGC AGATCAAGGC GTGGCAATTA TGTGGATTAC CCACAGCGAT 15720

GACCAAAGTA TGCGACACTT TCAAAAGCGT ATAACAATTG TTGATGGTCA AATTTCTAAT 15780
CATTCCGATT ATCATTTCAT ATAAAGAAGG TTTACATATT ATTAAAGATT TAATTGTTGC 15900

GACATTACGA GCAGTTGTGC AATTAATCAT TTTGGGATTT TTGCTGCATT ATATTTTTAA 15960

AATAAACGAT AAATGGCTGC TTATTTTATG TGTATTGGTC ATTATTATTA ATGCATCATG 16020

GAATACAATT AGTCGAGCAT CACCAGTGAT GCATCATGTG TTTTGGATAT CATTTCTAGC 16080

TATCTTCATT GGAACGGCAT TACCGCTTGC AGGTACTATT GCGACAGGGG CCATTCAATT 16140

TACCGCAAAT GAAGTTATAC CTATCGGCGG CATGCTTGCA AATAATGGCT TGATTGCAAT 16200

TAATTTAGCT TACCAGAATT TAGATCGTGC ATTCGTACAA GATGGTACTA ATATTGAATC 16260

TAAATTATCA CTTGCAGCTA CACCTAAATT GGCTTCTAAA GGTGCAATAC GTGAAAGTAT 16320

TCGTTTAGCT ATAGTGCCAA CTATTGATTC GGTTAAAACA TATGCGCTTG TGTCGATTCC 16380

TGGTATGATG ACAGGCTTAA TTATTGGTGG CGTACCACCT TTACAAGCGA TTAAATTTCA 16440

ATTGTTAGTC GTGTTTATTC ATACAACTGC GACCATTATG TCTGCTTTGA TTGCGACATA 16500

TTTAAGCTAT GGTCAATTTT TCAAT6CAAG ACATCAATTA GTAGCACGAA ATACTGATGT 16560

TAAGAGTGAA TCATGATAGA TTTTACTGCA TCAGATTTAG GCATTAGTTT TAATTGGAAA 16620

TGAAGTGACG CGCACATATA GTATCX3CTAT TCATTAGCGC AGCGAAAATA TTCATAAAGG 16680

CACGCATACT TTGTAGTCAG TTATCTGTTC TGACATATAA AGCGTGCGTG CTTTTTTGGA 16740

GTTATTGTTG AAACTGAAGT AATTATACAT AATTATTAAA TGACATACTT GTGTTAATTT 16800

TTCAAATACT GAAAAACAAT TTCaATAATT TTCCaATTAA GCACAGAAAA TTAAAGCAAA 16860

ATATTATATA ATAGAACGGT TATATATaAA nATTngTgCA CACATTTTTT AATAAATOGT 16920

TATTCTAAGG GAAATGAATA TCGGAAATTT TGTTTGAAAG GAGTTTTAAA TTGTCAATCA 16980

TGCGACTATT TACATTCATT TTAAGTATTT TTATCGTAGG AATGGTTGAA ATGATGGTTG 17040

CAGC^TTAT GAACTTGATG AGTCAGGACT TACATGTATC AGAAGCTGTC GTTGGTCAAT 17100

TAGTGACAAT GTACGCTTTA ACATTTGCGA TATGTGGACC TATTCTGGTT AAATTAACGA 17160

ACCGTTTTTC ATCAAGGCCT GTATTATTAT GGACATTACT TATATTTATC ATTGGTAATG 17220

GCATTATTGC TGTAGCGCCA AATTTTTCaA TATTAGTAGT TGGTAGAATT ATCTCATCTG 17280

CAGCAGCAGC ACTAATTATC GTAAAAGTAT TAGCTATTAC AGCGATGTTA TCAGCACCTA 17340

AAAATCGTGG TAAAATGATT GGACTTGTCT ATACAGGGTT TAGTGGTGCT AATGTTTTTG 17400

GTGTACCAAT TGGAACGGTT ATCGGCGATT TAGTAGGTTG GCGCTATACA TTTCTATTCT 17460

TAATTATTGT GAGTATTATT GTTGGCTTCT TGATGATGAT CTATTTACCG AAGGATCAGG 17520

AAATACAACG AGGCCCTGTG AATCATGAGA CACCATCTCA TGAAAATCAT GTTACTTCGA 17580






CAAACTCAGT GACATTCGTC TTTATAAATC CACTTATTTT ATCTAATGGT CATGATATGT 17700

CATTCGTTTC ATTAGCACTT CTAGTAAATG GAATCGCTGG CGTTATTGGA ACATCATTAG 17760

GTGGTATATT CTCCGATAAA ATTACAAGTA AGCGTTGGTT AATGATTTCT GTTTCTATTT 17820

TTATCGTCAT GATGTTACTT ATGAATTTAA TCTTACCTGG TTCAGGTCTA TTGTTAGCAG 17880

GACTATTTAT TTGGAATATC ATGCAATGGA GTACTAATCC AGCAGTGCAA AGCX^TGTGA 17940

TTCAACATGT TGAAGGCX3AC ACAAGCCAAG TAATGAGTTG GAACATGTCT AGTTTAAACG 18000

CTGGTATTGG T6TTGGAGGC ATTATTGGAG GCTTGGTCAT GACACATGTT TCTGTTCAAG 18060

CTATCACATA TACGAGTGCC ATCATTGGC6 CATTAGGATT AATCGTTGTT TTCACATTGA 18120

AAAATAATCA TTATGCTAAA ACATTTAAAT CATCATAATT CTCATATGAm AAGCACGCCT 18180

GCTATCAAAT TCAGGTGTGC TTTTTTAGAT GCGATAACGT TATTGATATG TGCGATAATA 18240

GCGACGTTCA TTATGATACA TCGGCCAAGG CATTTTACCG CTTTTAGCAA AATTAGCTAA 18300

ATCATTTTGC ATTTGTCGAC TTAAAAATTT AAGGTGaGCA GTTGTTGGaT ATgAT 18355



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1192 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 
CGCAAAGAAG TACAAAAAAT GTTTTTACAA GAAGGTATTA AAACACCTCA ACCAATTATG 60 

ACTGCTTATA ATCATAGTGA AAACGgTGTT TAGTAGTTTA TAATACATGG AGGTCATATT 120 
TAATGGCGTC AAAATATGGA ATAAATGATA TAGTAGAAAT GAAAAAACAA CATGCGTGTG 180 
GAACAAACCG TTTTAAGATT ATTAGAATGG GTGCAGACAT AAGAATTAAA TGTGAAAATT 240 

GTCAAAGAAG TATTATGATT CCACGTCAAA CGTTTGATAA AAAACTTAAA AAAATCATCG 300 
AATCTCATGA TGATACACAA AGATAGGAGA ATGATTAATG GCTTTAACAG CAGGTATCGT 360 

TGGATTGCCA AACGTTGGTA AATCAACATT ATTTAATGCA ATAACAAAAG CAGGTGCTTT 420

AGCAGCGAAC TATCCATTCG CTACGATTGA TCCTAATGTA GGGATAGTAG AAGTGCCAGA 480 
TGCTAGATTA CTTAAATTAG AAGAAATGGT TCAACCTAAA AAGACATTGC CGACTACATT 540 

TGAATTTACA GATATCGCTG GTATTGTGAA AGGTGCTTCA AAGGGAGAAG GGTTAGGTAA 600

TAAATTCTTA TCACATATTA GAGAAGTAGA TGCGATTTGT CAGGTCGTTC GTGCATTTGA 660 
TAATATGGAA TTAGTACTAG CGGACTTAGA ATCTGTTGAG AAACGTTTGC CTAGAATTGA 780

AAAATTAGCA CGTCAAAAAG ATAAGACTGC TGAAATGGAA GTACGTATTT TAACAACTAT 840

TAAAGAAGCT TTAQAAAATG GTAAACCCGC TCGTAGTATT GACTTTAATG AAGAAGATCA 900 

AAAATGGGTG AATCAAGCGC AATTACTGAC TTCTAAAAAA ATGCTTTATA TCGCTAATGT 960 

TGGTGAAGAT GAAATTGGTG ATGATGATAA TGATAAAGTA AAAGCGATTC GTGAATATGC 1020 

AGCGCAAGAA GACTCTGAAG TGATTGTTAT TAGTGCAAAA ATTGAAGAAG AAATTGCTAC 1080

ATTAGATGAT GAAGATAAAG AAATGTTCTT AGAAGaTTTA GGTATCGaAG AACCAGGATT 1140 

AGATCgrTTA ATTAGGAmCA CtTATGAATT ATTAGGnTTA TCCACCATAA TT 1192 
(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7494 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

AATATAGCTG CAATAGCATC TCGTTTCATT TGTATAATCA ATTCCGGTTT AAATATCAGT 60 

GTGAACGTAA GCACGACACA GATTAAAAAT AACACTGCCG GAATGAGTCG TTTCAATCGT 120 

CGCTtCCAAA ACTCTAGCAA ATCGATTTTT TGCGTCCGAT AATACTCACT TATCAACAAA 180 

CTTGTTATTA AATAACCTGA AATAACGAAG AATGTATCTA CTCCTAAAAA GCCCCCACTT 240

AACCATTGTG CATTCAAGTG ATAAATAATG ATTCCTATAA CTGCGAATGC CCTCAATCCA 300 

TCTAATCCAG GTAAGTATCG CGGGGAATAC ATTTTTTCTA AACGTTTAAA GTCTTTTGTA 360 

TCCA'feTTAA TAAACGCCCC ATTTATTTTT CTCTATTTTG TAGTATATCA CAATATTTTT 420

GAAAATAAAA TATTGCACTG aTTTTCATTA ATTGATTTAA CCCTTAATTA AGATAGTTTT 480 

AAATTTTTTA TTAAGTAGAA AACAATTATT ACAGTTGATT TCATTACTGC AAACCACATA 540 

TAAATTTGTC GATTTTACTA CATAACATAG ATTATCATAG ATTCTTGAAT TTTTAGCAAA 600 

ATAACTGTTA TTTTCATTAT ATTTTTACAA AAAAAGGTTC GTTTTATATT TTATGCATCT 660

TACTGTAACA GAATCATTAA GATATGCTAT TCOAATATAC TTTTTCAAAA TTTATATAAT 720 

GAATAAATTA ACATGTATTG AAAAAAAAGC GAAATGCAGC CTATCCTCTA ATGTAAACCA 780 

AACGATATAT CTCGTCAGAC TTTATATTTA AACGCTATGT GTCACTTTTA AAATGAATAT 840

TACTAAGATT GTCATATCAA TTATTATTGC ATCGAATTAA TCTTTTAAAT TTCTGTAATA 900 
ACX3GAAGTCA TTATTAGAAT AAAAATACTG TGCACTAATA AATTTATCAA TTGTTCCTAA 1020

ATAAATACCA TCGATATTTT GTTCTTTACA TGTCATTATA ACTTTATCTA AAAGTTTTTT 1080 

ACCTATTTTT AAATTCCTAT AACCTTTATC AACAAACATT TTTTTAAGTG CAGACATATT 1140 

ATTATCTAGT CTAATCAAAC CTATAGTACC AACAATATTT TGaTGATTGT TTATTGCAAG 1200 

CCAAAATgCC CTCCATTATT CAAATAGTTA TGTTCGATGT TCTCCAAATC AGGTTGATCA 1260 

TCTCTATCAA TTTTTATATa AATTCATTTT TTTGAATCGA TAAAATAAAC TCGATTAGCT 1320 

CTTCCTTATA AGACCTATTA TATTCAATTA TGTTTATAGC CATTTTTATC TCCTTTTTCA 1380 

TTTAATTTAA TTATAAAATG TGCGTTTAGT TTGTATCTAG TGTACTCAGT ACAGCCTCAA 1440 

ATGAAGTTTC ATTCXJICTTG GCACTTAATA AAGACAAGTA TTTTAGCA6T AATACAATAA 1500 

AGTCCAATAA ATTTCCCTAA CTTCAATATC CACTTTTTAA AAAATGTATT TTTAATTAAT 1560 

AAAAAAACTC TCCCCAATTT CTATGGGAAG AGCTATATAT TTAATGTCTA AACATTACTT 1620

TTATTTATTA TGAAGGAATT AGAATCCCCA AGCACCTAAA CCTTGTGCTT TGTATGCTTT 1680 

AACAGCTGCX3 TTGATTTGTT GGTCAACAGT GTTT6TTGGA CCCCAACCTC GCATAGTTTG 1740 

GAAXAAACCT 6AAGCACCTG ATGGGTTGTA AGCATTTACT TGACCATTTG ATTCACGAGC 1800

GATGATTGCA GCCCATGTAG AAGCTGAAAC ACCAGTACGT TGAGCCATGA TTTGAGCTGC 1860 

TGATGAACCA GTAGCACCTG CAGTATTACC ATTGCTTAAT CTCACTGAAC TTGAAGTAGT 1920 

TGAAGTGCTG TAGTTATGGT AAGTTGGAGC TGAAACAGCT TCAACGTtTG AGTTACTTGA 1980 

TTGTGCATTG TAGCTTACTG ATTGTACATT TGAACCTTGG TTGTATGAAG TAGTGTAGTC 2040 

TGCACCTGCA ACGTTTGAGA AACCAGCAGT TTGACCATTA GCTGCTTCAT AGCTCCATGA 2100 

CCATGTAGTA CCATTTGAAG TGAAGTTATA TTGGAAACCA TCTTTTACAA AGTGGATGTC 2160 

ATA'reCACCA TCTTTGATTG GAGCTGCATT TAATTGATCT TGGTGATTAT GCGCTAAGTC 2220 

AACTAAGTGT GCTTGATCAA CGTTTACTTC AGCAGCGTGT GCTTGATGTC CTGTACCTGC 2280 

TGCGTAACCT GTTACACCTA ATGCCACTGC TAATGATGAT GCCATAATTG TCTTTTTCAT 2340

AGTAAAAAAT CCTCCAGTAA TAATTGTnAG TTTATGTTTT TAGTAATTAT AtTTTGaATT 2400 

TGAATGTCGT AGTgCAAGTT TAAATTGTCT TTTATTTCTT TCaACGGTAC TCACTATATC 2460

ACAaAAAACC AGCCAGTAAA TTACACTTTC TTTACAAAAC ATTACAATAT CAAGTGTTAT 2520 

TTGtAATGTT GAAATATGGC TGTTTTATAC TGTAATGTGA AATATGTGCC CTTTAGAATC 2580 

CAATCAACCC TTGAAATAGT CTTTAACACA TAAGATTTTT ACTATATTTA GCTCAACTAT 2640

TACAGCTTTC GTAATATTAC AGATTGTATT TTTGTTACAT AGCTGTAATA TATCTGACAT 2700 
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45 



TACACATGTA TTGATTGCTA TTATTGTTGT ATATTCAAAG TTTTAAAACA CACATCTTTT 2820

GTGAATTGTC TTATCTTTTA TTAGCGCAAA TAAACTGCAG CTCAATTATA TTGTTCAACT 2880

TCATTCTCGC AATTCACAAT AACATTAAAT AATTTTTGGT CTCATATTTT CAAAAAACAT 2940

ACTGTTATTA TCCCATGAAT TTAAAAATAT CATTAGTATA TAAACGAAAC ACTTTACGAT 3000

AAATGATATC TGCAAGCCAA GCTGTTACAA ATGGTACAAC AAAGAACGCT ACTACAATTA 3060

GTAAGACACT CAACCAAGCA GAATCAACCT CCATAAATTT AAATGCATTA ATCGGTCCTA 3120

CCATTCCTAT AAAACCAAAT CCAGCTGACT CTTTCGTTCC ATGAATACCT ACTAATGCTG 3180

ATACCAAACC TGATACAATG GCTGTCGTTA ATATTGGTAA CATAAGAATT GGATATTTCA 3240

CCATATTAGG TATCATCATT TTAACGCCTC CAAAGAAGAC GGATAACGGC ACCCCTAAAC 3300

GATTCACTTT ACTTGTACCA ATTATCAATA CTGCTTCAGT CGCGGAGATA CCAATTGACG 3360

CTGATCCAGC TGCTAAACCT GTAATACCTA TCGCAAAGGC AATGGCCACA GTTGATAGTG 3420

GCGAAATAAT AATAAGACTA AATACCATTG AAATCAAAAT ACTCATGACA ATCGGTTGTA 3480

ATTCTGTAAA ACCATTAACC ATATTACCGA TGGCTGTTGT AATCATTTTC GTATACGGCA 3540

ATATXAAAAC ACCAATTGCA CCTGAAATAC CGCCAACAAC TGTTGGGAAT ACAATCAATG 3600

CCATACTACC TACGCGATGT TGAATAAGTA AAATGAATAA CACTGCAATC GCTGCTGTAA 3660

TCATTGTATT AATTAAATCA CCAATACCCG TAATCATCCA AGCACCATTT TTAAACTGCG 3720

CTGCACCGCT TCCTACATAT GCTGCACTTG CCACAACAGC AATTGCTAAT GGCGATAGGT 3780

CAAATTTCAT GGCAACCAAT GCACCAATCA AAGCAGGTAC TGTAAATTGA ATTGCAACGA 3840

CAACGCCTAA TAACGTTTTA AAAATCGGAT GATAATCCAT AAAGTATTTA AAAATTTCTC 3900

CAAGTATCGC ATTAGGAACT AAACCCGCAA CAATACCTAT GGCGACACCT GATAAAACTC 3960

TAAATATAAA ATCTTTGGGT GTAATTGTTT TAATTGATGT CATAATATCA TCCTTCCATT 4020

TATGTATATA CATCTGTATG CAAATAATAA AGAGCCTTAA GTTATAAGCT GCCACTAGCT 4080

TAAATTCTAA GATGTGCATG CCGATGTTGT TATATTTAGG CTAGCAGTAT CATCTATAAC 4140

TCAAGACTAT GAAAAATAGT ATATCACAAA ATTCTGAATT TTTAGATAAA TAAATTGGCA 4200

ATTTTTCAAA CATATTGTTA CAATACACTT TTATTTTATC TTCATTTTTA AAATGCATTA 4260

ATACAATAGA AGAAAGACAT TCAAATGCTT ACCAAAAAGG TACATTATTT GTTAGGAGCG 4320

TATCAGCaCT TACATATCAT CAACACAATT GACAATATAA TAGAAGATAC TGATAATAAG 4380

TGTTAAAACA ACAGATGTTA GGTAGTGAAC AAATGATGGA AAGTAAATCC ATAGATCCAA 4440

GAATCGTTAG AACCAAACAA TTGCTTGTCG ATGCTTTTCT TAAAATTTCT AGAGAAAAGA 4500
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TTTACGCTCA TTTCGCTGAT AAAGAAGACC TCCTAGACTA CACATTATCT GTAACCATTT 4 620 

TAAAAGACTT GAATGATAAT TTGAGCATTT CTAATGTCAT TAATGAAAAG GTTCTGCGTA 4680 

ATATTTTCAT TTCAATTGCG AGTTATATCA AAGATGCTGC AAAGTCTTGC GAATTAAATA 4740 

GTGAAGCATT TTGCAACAAA GCACATCAAC GTATTAATAA TGAATTAGAA GATATTTTTG 4800 

CGATTATGTT AGAAAACAGC TATCCGGAGC ATCAACGAGA TATCATTGTA AATAGTGCGA 4860 

GTTTTTTAGC AGCTGGTATC TCAGGCTTAG CATTACATTG GTTTAACACG AGTCAAGAGA 4920 

CAGCXTGATGT 6TTTATCGAT CGCAACCTTC CATTTTTAAT TCATCATATA GCACATTTTT 4980 

AATAAAACTT GGTATTTAGT CATGCATCTT GAAATCACTA TGTGACTTAG GTTCATACTT 5040 

GTACACACAA TAAAATTTAA CGTATTACGA TTGATTAGCC GTGTCTAGGA CATAAATCAA 5100 

CGTCCTATAC TCTACAATGT CATATTAGCA GTOGTTAACT GAATGAAAAT AAGCTTGTCA 5160 

TTAAAACATA TAGATTTTAG TGACAAGCAT TTTTGTTTTT GCXSTACTTAA ACAACACTTC 5220

AGGCAATATG rTGTTTAGGC AACAAATGAT ATGTGCGTGT TTATTGGCAA ACGTACGACA 5280 

TAGTAGTATA GTATGTCTAA ACAACATATG TTGCATAGTT GATATGCGTT GTTTAAATAC 5340

TAAGATAGGA GGGATTGACG TGAGCGAGAC AGATGAACCT CAGGGGTTTG AACGCACGCA 5400

TAATATATTA AATATTAATC AGAGTAGTCT GGGTGTAGTG ACATACATTA CAAATAAATT 5460 

AAAGTCGACG TTGAAGCAAC ACATAATAAT TGCTCGTGGT AAAAAGCGAA TCGACTATCG 5520 

ACTGTCGTAT AACTTTTACA TACGTATTAT GATAATGTAG AAATCAAGAA AATCGACTGT 5580 

GAATATACCT ATGCTATGCC CATTGCAATT TTAATAAGAC ACACGATGTC ATTCGACAAT 5640

GCTCATTTCT TTGCTCAGTT ACGTCATCCT GTCTTATAAA ACAACATTGC AGACATGTAT 5700 

ATCAAAOGAC ACTTCAATAA CATCACTTTG CCcATCGTAC TACTAGTAAA ATCGTGTCTC 5760 

AAATCCCTTA TTTTAATTCC AAAAAtCTGC TGGTCAAAAG ACCGAGAAAC TAAAAACATT 5820 

ACTTAATGTG TTGATAAATT ACCATATAAA AATAATCTCA AAATATATCA ACACTTGATT 5880 

CTAAGGAGGA TATGACAATA TGAAAATTTT AGATAGAATT AATGAACTTG CAAATAAAGA 5940 

AAAAGTACAA CCACTTACTG TAGCTGAAAA ACAAGAACAA CATGCATTGC GTCAAGAcTA 6000 

CTTAAGcATG ATCCGAGGAC AAGTATTAAC AACATTTTCC ACAATAAAAG TGGTTGATCC 6060

AATCGGTcAG GATGTCACAC CAGATAAAGT TTATGATCTT CGCCAACAAT ACGGTTATAT 6120 

TCaAAATTAA tATTTGCTCA CGAGGTATTG CACTTAAGGT GCCAACTGAC CTCATAAACA 6180 

AAGCCCATAC TGATTGAAGA CACTAATGTG tCsaCCATGG TGCACATTAC GCTTCATCTC 6240 

TGTATGGGCT TTTTATTTAT TCTTTTGAGA ATTTCATTTT AGCAGACCAA AAAATTAAAA 6300 
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TGAACGACTG TGCCACCCGC TTCTTTCACT TTATTCACCA ACTGGTCAAC TTCTTCATTT 6420 

GTGTTCACAC CTAGAGAAAT CATCACTTCA TTTGGTTCAG TATTAAGGCT TTGCTGACTT 6480

ACATTTTGAA AATGCTTGTn TTCTATTAAA ATTACGGkTG tTTGACCTAT tTGAATGCCG 6540

ACCATTTTAT CTAACATTTG TGGGTTTCTA TTTATTTTAA ATCCTAACGC TTTATAAAAC 6600 

TGTGCGCTCT TTTCTAAATC TTGCACATGC AAATTAAACC ACATTGATTG AATCATGATT 6660 

GCACCCCATT CATTACTTAT TATAGTTTTG GACTTTAAGC CAATCACTTA ATGATAATCT 6720 

TGTTGGATTT ATTTCAGCCA TTAATTCAAA GTCTACTTCA TAACCTTTTT CTTCCAACCA 6780 

TTGCTTTTCT GCAACACCAC TAACAAATTC TCCTTCTATA ACAGTAGATT TACCTGTCAC 6840 

TTCACTAAAA ATTGTTGCTG CTTCACTTAA TGTAACTTCA TCGGAACCAA TCTCTATTGA 6900

TTGATGCGTA AAGCTTTGTG GATGTGCAAA AATATACGAT GCAATTTTAG CTATATCAAT 6960

AGAAGAAATC ATTGTGAATT TTATATTCGG ATTAATAAAT TCTGGTAATG TAATACGTTC 7020

ATCTTCGACT TTAGCAATGC GTAAAAAATT ATCCATAAAG AATGATGGTT TGATAACTGT 7080 

TGCATTTATA TTAGATTCCA TTAATCTATT TTCTATTTTT GCTAGTACTT CAAAGTGTGG 7140 

GCCAGTTCGA TTTCGATTAA CCCCTCCCGC AGTACTATAC ACAATATGTT GAATATTTTC 7200 

TTGCTCAGCT ATTTCAATTA TCTTCATACC TTGTCTTAAT TCTTCGCTAA CATCATCTTT 7260 

AACGATTGGC TGAATACTGT ATAAGCCATA CTTACCTTTC ATCGCTGATT GCAAACTAAC 7320 

ATTATCACTC AGATCACCTT CArCGATTGA TAAATGCGGA TGTCCTATGT CTGAAAGTTT 7380 

ACGATTATnC TTATTTCTAG TTAATGCACT TACATACCAT CCATCCTCTA ACAACTGTTT 7440 

TACAACTGCA TTACCTTGCT TCCCTGTTGC GCCTATTACn AAAATATCTT TCAT 7494 

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11802 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

AATTTATTTC GCCGTCCCAC CCCAACTTGC ATTGTCTGTA GAAATTGGGA ATCCAATTTC 60 

TCTTTGTTGG GGCCCcGCCC CAACTCGCAT TGCCTGTAGA ATTTCTTTTC GAAATTCTCT 120 

GTGTTGGGGC CCCTGACTAG AATTGAAAAA AGCTTATTAC AAGCGCATTT TCGTTCAGTC 180 

AATTACTGCC AATATAACTT CGTAGATCAT AGAACATTGA TTTATTTCCC AGCCTATTCT 240 
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AGCAAAGGTA ATAATGATAT TAATAATGTA CAAAAAATAT AAATCAAATC GACATCCTTA 360 

TAAAACATCA GAACCACTAA AAACAAAAAA GCACAAAATA AAATTAAATT TAAAATAAAC 420 

GACCACTTTT CAAAAAAATC TCtTTTCaTa TTTCCACCCC TAATTTTAAT AAGCATTATT 480 

TTATATTCTC TTTTAAGTTT ATTATTCAAA AGGAAAACAG AAATATCTTT CaATATTATT 540 

ATAAACATTT CAACTACTTT TAAAAACCAA CAAAAAAATA CTTATTTTAA GTAGATGAGC 600 

ATAAGTGAAC ATAGTTCTTT AGTTATAATA ATTAATTCAA CCAAAAGTCG ATTTGTTTTT 660 

GC7ATTGGTT TTCATTTCCT CTTAAAGATA TTTTCATTAA ATCTGTCAAA TCAATAGACG 720 

CTATATTTTT CAACTTATCT CTATATTTAT TTTTAGTACG TCTTTCTAAA TTTCCCCATT 780

CCTCTTCTTC GTGAGTTAAT AAATGAAGCA TTGCTCGTTC TTGTATATTT TCAATCATTT 840 

TTAAATTCGG TTTTAAAATA TGCAAATCAT CAAAACAATC TTTCCAACAA TCAACCATAT 900 

CTCGTTTTAA TTCAATTTCC ACACGCCATA GAAATGTTGA ATCAATTTCA ACATCTGCAT 960 

TATCTTTACG TTCTTGTTTT TATTATAAAT CCGAATAAAC CTATCACTAT TACGCACACC 1020 

AAAATATTTT GTTTCTGGTT TTACATTACG TCCATAAAAT ATAGTTTTCT TTACCGACTT 1080 

ATCTGACAAT GCATAATAGT CATTTAAATC AAATTCAAAA TCAAAAGCCA AATCTAATCT 1140 

CGTAAAACTA ACATCGTCCA AATAACTGAT GATATTTTGT TTTAACCAAA GCACTTCATC 1200 

ATGCGAAAGC TTATTAGGAT TAAATTCAAC GCGCATAtAC GTCTATTCCA AAGAGTTGCT 1260 

TTTATTTTGT CATATTCAAT ATAAACTTTT TCTTTAAGAG CTTTAGCTTT AAAGTTTGTT 1320

TGTAAAATAT CCCAAAGCCG AATTTCAGGA TTAGTACTCA TAAAATGTGA AAGTCTCTCT 1380 

GCGTTAGACA TGCTAAGATT CCCAACAATC GTTATAGCGT CAAAAGACAA TTTTGGAATA 1440

GCTAGTGACA TCCTATGTCG ATTTAACCGG CTATTACCGG ATATTAGAGT ATCCAGTTTT 1500

ACAAATGGAT GAAACGAAAT TCAAAACACT AAAAAATATG TTCCACTAAC AGCAAAT^AAA 1560 

TACCATTATG TTCCTACTAA AAAACyAAAA ATACTGGAGA ACAAATGTCA GGATATAACT 1620 

TAGGATACTA TGTAATAAAA ATTTACAATA AAAAAACAGG AAAACAAATT TCAAGTAAAA 1680 

GmATACCCAT ACAAAGAGGA TAAAATAAAA AACCTCGAAC TGaAATGATG ATCTTTTCAG 1740 

CTCGAGGTTT AAATATTGGT GCCTTATTTA TATAGATTCG TTATATTATA TTCTCTATTT 1800 

TCATTAACraT AATCCTTAAA GAGTTTTAAA TTAATACCTG CTAGATGATT CAAAAATGTT 1860 

TCATCAACTT TTAAATAATT CAATAATTTT TGTGGTGTCA GTAAATnTCT ATCAAAATAC 1920

AACTTTAATA AACTATTCAT TTTGACAGGA CGTGACATTT CAATCACGTC GTCTAAAGAT 1980 

AATACTTTCT CGCTTTAnAC AAAnACAAAA ACTTACCCGA TTAAAATCAA GTAAGTTTTA 2040
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TATTTGATAA AAAATCAATA AGTAATTGTG CGCCTTCAAC TTGAATATCT TTTACAACTG 2160 

GCGCGTCGAT ATACATATCA TACTGACCAC CGCCTACTGC ACGATAATTA TTTACACAAA 2220 

TTGTATATGT CTGCTTTAAA TCAACTGCGT GACCTTGAAT CATCATATTG CTCACACGTT 2280 

GTCCCTTTGG TCTTCCAACA TGAATGGTAT AACTTACGCC ACCATATATA TCATAATTAA 2340 

AGTGTTGTGG TTTGGGTTCA AGGAAGTCTG CGCTCACACT AACTTCATCA TTTTTCACGT 2400 

CAAAATATTC TGCTGATCGT TCAATGGCTT CTTTAAGTTT GGCACCACTT ACAGCTAAAA 2460 

CTTT/^TGT ATTTGGAAAT GGGTAATTGT TAATAACATC TC6CATCGTC ACGACTTGCT 2520 

TGAAACCACT AGCAGAATCA AACAAAGCTG TACAGGCAAC ATCTGCGTCA CTTTTTTCTA 2580

ATAAAGCGTA ATTCATAAAA TTTGTAAAAG GATGCGGTGC CACACGTGCC TCAAATGCAT 2640 

GATTAATCGT CATATCATAT GGCAATGTAG TAATTTCGTA ATCTAACCAG TCCTCTAACT 2700 

GCTTTCGTAA ATGTTGGTCA TCTTCATCAA TAGTAAATGT GGAATCATCT ATAACAGGAA 2760 

GTAATTCACA TGATTCAACG GATAGATTTT CATATTCATC AGTACTCAAG ACTACTCTGC 2820 

CTACAGTTGT ACCTCTCGTA CCAGGTTGAA TCACAGCCGT TTGCTTAAAC CTTTCAGCAA 2880 

TTTGTCGATG TTGGTGACCC GTAATAAAGA TATCTATATC TTTAGAAAAC GCTTCTAACA 2940 

TGGCATATCC TTCATTTTCA CCCGTTAATA CTTCGGTCGG CGTACCACTT TCTAAATCCT 3000 

TTTCAAATCC ACCATGGTAA CAAACCACAA TGATATCTGC ATGTCGCTTC ATTTCAGGTA 3060

AGTATTGTTG AAGTATTTCA AAAGCACTAT GAAACGTArT GnCnTGAATA TGCTCTGGTT 3120 

GTTCCCAATG GGGAATAAAT TGTGTCGTTA AACCTATCAC ACCAACAGTT TGATCTCCAA 3180 

CCTGAAAATA CTTCACACCG TTATCAGTCA ATGTACTATC ATTTTCATAT ATATTAGCGC 3240

ACAAAACTGG ATAATTGAGT CTGCGTAAAG TGTCTTTTAA GTATGGTAAT CCATAATTAA 3300 

ATTCATGATT ACCAAGCGTA CCAAAGTCGA ATGCCATTCG ATTATAAAAA TCAACTAAAG 3360 

GCTGGCTACT GCCGCTATGC GCGATTAAGT AATTACAAAA TGGTGACCCT TGCAAAAAAT 3420 

CACCATTATC TATTTTAAAA CTTTGGTCAT ACTGCCTTCT GTsTTGTTCT ATAACATGAT 3480 

TCGCTAGTAA CAATCCCATA GGTTGATATT GATTTCTACT CGTAAAATCT GTTGGGAAAA 3540 

TATAACCATG TACGTCACTC ACGACATAAA ATGCTATGTT TGACATCCTC ACTCACTCCT 3600 

TCAATCACAA ACATCTTTCT TATTTCTATT ATATATTTAT TTGAAGTCTG TTGTAATCAA 3660 

GGTTTTGTCA CCGAGTTTTA AACGAATCTT TGAACCTTCC ATACTTTCAA GTACTTTAGC 3720

ATTGACCTTA ATTGTGACAT TTCCGTTTTC ATCTGCTTTA ACTGTTGGCA AAGTACTGTA 3780

ACCTGGTGGG TTATAATCGT TATCTTTACT TGAAAATTGT CCGATTTGAC GTCCGCCTTC 3840
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30 



TATTGTCATT TCAAATGGCT CATTTACAGA AACATTTTGC GGGATATCAA ATGTTACTTT 3960 

TTCGTTCTGA TTTGGTGGTG TATGATCATC TGGTGTGTTT GGCTGAGGAT CTGCGCCTTT 4020 

TTCGCTGCCA TAACTACCTG CTTTAAATGT TGTTGGATCA TACCATTTAT AACCACTCGG 4080 

CGGTTGTGAC CATGGCTCTT TTTCAGGCTC AGTTGAACGC TCTGGTCGTT CAAAATCAAG 4140

CAACTTAGTC TTTGTATCTA ATGTTAGGCT ACTCGCCTTA AGTGATTTCC CATCATTATC 4200 

TTTAGACATC CAAGCCGTTA TATTATTTAA TAGCTTACCG TTGTCTTGTT CTTTAAAACC 4260 

ATCATATGTT TTCTTCTTTT CTCCATTATC TTCTCTTACA TATTTGGGCG AACTATCTTC 4320 

CACAAGTGAT GAATCACCGA TAAATGCTGC TTTACCTTTT CCAACTTTAG AAATTGCTAC 4380

ATAGGGGCCT TCTGCTTTAC CGCCCCCATT ATAAATACCT TGATCTACAG CATGTGACCA 4440 

TTTACTTTTC GCTGGCAATT GTTCTGGTGT ATACACAATA CCTTTTGCTT TCTCTGGATT 4500 

AGTAATTGCT AATGTCGATC CGGCATGCAT AGAGACAGAT TTCACACCTT CAGTAATACC 4560 

GAAACTTTCT TTTGAAGAAA CAATATTGCT CGTATTTAAA TCACCTAGTG CATTATATCG 4620 

AAAACGTACG CCAAAGTTTG TAGATAACCA ATCTGAACTT TTCACACCTT GCATTGCAGT 4680 

AGAACTTTTT TCTTCTGCAT TCATACCTTT CGACATATCT TCATATGCTC CACGTCGATA 4740 

ACCATTCATT GCCTCCGATG AATCAATACG ATTTAAATTT CGGTCAGCAT TGTAATGATC 4800 

TGAAATAAAG ACAACATTGC CACCTTGTTt CACATATTTA ACAATTGCTG CCTGTTCTCA 4860 

TTCTTTGAAA GGAATGTTAG CCTCAGGAAT TACAAATATT TTGGAACTTT TCAAACTTOC 4920 

TTCTGTTATG TTCGAATGAC CATCAATAGC TTTAACGTCA TAACCTTGTT TTTGTATTGA 4980 

ATCCGCATAA TCTGAAAATG CACCATCACT AACCCAATCT GCAGCACCAG CTGTTTGACC 5040

ATGAGAACGA TCGAATAATA CCGTTCGCTG TTGCTTTGTA GGTTGCGATT CATGCGTTAT 5100 

AGCTAAAGAT TGCGGTAAAG CACTTAATGA TACCGTTGCA ACAATTGCAG AGACAGTTAA 5160 

TGACTTATAT ATTTTTTTCA TTTTGTGAGG CTCCTTTTAA AATAAATTTC TTCTTGAATT 5220 

ATAGGATAAA AATTCGTTGC ATATGAGCAA TTTAACGAAA AATTTACAAA ATCTTATCAA 5280 

ACTCTTAAAG AAAGTTATTA AAATTCATTT TTATAAAATA CTTTTTAACA TTTAAATGTG 5340 

GTACGCTATA AGTGTAATTT CATTGCATAC ATATTACACG ATTAAGAATG TGAAGGGGAC 5400 

AGTTATCAAA TGAAAAATTT TAAGTGTTTA TTTGTATTAA TGTTAGCAGT CATTGTTTTT 5460 

GCAGCAGCAT GTGGAAACTC AAGTTCTTTA GATAATCAAA AGAACGCTAG TAATGATTCG 5520 

GATTCTAAAT CAGGAGGATA CAAACCTAAA GAATTAACCG TTCAATTTGT ACCTTCGCAA 5580 

AATGCTGGAA CATTAGAAGC TAAAGCAAAA CCATTAGAAA AATTACTATC TAAAGAATTA 5640 
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TCTAAAAAAG TTGATGTTGG TTTCTTACCA CCAACGGCAT ACACATTAGC ACATGATCAA 5760 

AAAGCAGCTG ATTTATTATT ACAAGCACAA CGTTTCGGTG TAAAAGAAGA TGGTTCAGCA 5820 

AGTAAAGAAC TTGTAGATAG TTATAAATCA GAAATTCTTG TTAAAAAAGA CTCAAAAATT 5880 

AAAAGCTTGA AAGATTTAAA AGGTAAGAAA ATTGCCTTAC AAGATGTAAC ATCAACTGCT 5940 

GGATATACAT TCCCACTTGC GATGTTAAAA AACGAAGCAG GTATTAATGC AACTAAAGAT 6000 

ATGAAAATTG TGAATGTTAA AGGTCATGAC CAAGCAGTTA TCTCATTATT AAATGGAGAt 6060 

GTAGATGCTG CGGCTGTATT TAACGATGCA CGTAATACTG TGAAAAAAGA CCAACCAAAT 6120

GTATTTAAAG ACACACGAAT TTTAAAATTA ACACAAGCTA TTCCGAATGA CACAATTTCT 6180

GTAAGACCAG ATATGGATAA AGATTTTCAA GAAAAATTGA AAAAAGCTTT TATAGACATT 6240 

GCTAAATCAA AAGAAGGTCA CAAAATTATT AGCGAAGTTT ATTCACATGA AGGATACACA 6300

GAAACGAAAG ATTCAAATTT CGACATTGTA AGAGAGTACG AAAAATTAGT TAAAGATATG 6360 

AAATAATCAT TATTTAACAA ATGAATCATT AGCGAATTTG GTATTAAAAG CTTTCGTTCA 6420 

ATAGATATAT TCTAGATTAA TATTGAAAAG CTAGGCGCTA AACTGAAACA GATATAGAAA 6480 

GGTGTCGCTG TACATTTGAA ACCATTTGTA CACAGAAACC CAATGTCTAT GATATTTCAG 6540

TTTACCTTGG CTTTTCTTTA TTAAAGAAAG GTGTCAAACA TGAGTCAAAT CGAATTTAAA 6600 

AACGTCAGTA AAGTCTATCC TAACGGTCAT GTAGGCTTGA AAAATATTAA CTTAAATATT 6660 

GAAAAAGGTG AATTTGCAGT TATTGTCGGA CTATCTGGTG CTGGGAAATC CACGTTATTA 6720 

AGATCTGTAA ATCGTTTGCA TGATATCACG TCAGGTGAAA TTTTCATCCA AGGTAAATCA 6780

ATCACTAAAG CCCATGGTAA AGCATTATTA GAAATGCGCC GAAATATAGG TATGATTTTC 6840

CAACATTTTA ATTTAGTTAA ACGGTCAAGT GTATTACGAA ATGTACTAAG TGGACGTGTA 6900 

GGTTATCACC CTACTTGGAA AATGGTATTA GGTTTATTCC CAAAAGAAGA CAAAATTAAG 6960 

GCAATGGATG CACTAGAACG CGTCAATATC TTAGATAAAT ATAATCAACG CTCTGATGAA 7020 

TTATCAGGTG GCCAACAACA ACGTATATCT ATTGCACGTG CGCTATGCCA AGAATCTGAA 7080 

ATTATTCTTG CAGATGAACC AGTTGCTTCA TTAGACCCAT TAACTACGAA ACAGGTTATG 7140 

GATGATTTAA GAAAAATCAA CCAAGAATTA GGCATCACAA TTTTAATTAA TTTACATTTT 7200 

GTTGACTTGG CAAAAGAATA TGGCACACGC ATCATTGGTT TACGTGATGG TGAAGTTGTC 7260 

TATGATGGTC CTGCATCTGA AGCAACAGAT GACGTATTTA GTGAAATATA TGGACGTACA 7320

ATTAAAGAAG ATGAAAAGCT AGGAGTGAAC TAACATGCCT TTAGAAATAC CTACAAAGTA 7380 

TGACTCCCTT TTAAAGAAAA AGGTTTCTTT AAAAACGAGT TTTACCTTCA TGTTAATCAT 7440 
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AATACCTCAA ATAGGTGATC TATTCAAACA AATGATTCCA CCTGATTTCG AGTATTTACA 7560 

ACAAATTACA ACGCCAATGT TAGATACCAT TCGAATGGcT ATCGTAAGTA CAGTATTAGG 7620 

TAGCATCGTT TCAATACCAA TTGCGTTATT ATGTGCTAGC AATATCGTTC ATCAAAAGTG 7680 

GATTTCAATA CCCTCGCGCT TTATTTTAAA TATAGTTCGT ACTATTCCAG ATTTGTTATT 7740 

AGCAGCAATC TTTGTGGCTG TATTTGGAAT CGGTCAAATT CCAGGGATAT TAGCACTGTT 7800 

TATTTTAACT ATCTGTATTA TTGGAAAATT ATTATATGAA TCATTGGAAA CGATAGATCC 7860

AGGTCCAATG GAAGCAATGA CGGCTGTTGG CGCTAATAAA ATAAAATGGA TTGTTTTCGG 7920 

TGTTGTACCA CAAGCCATAT CGTCATTTAT GTCATACGTA TTATATGCAT TTGAAGTAAA 7980

TATACGTGCT TCAGCTGTGC TTGGATTAGT CGGCGCTGGC GGTATTGGAT TGTTTTATGA 8040 

TCAAACACTT GGTTTATTTC AATATCCAAA AACAGCAACG ATTATTTTAT TTACTTTAGT 8100 

TATCGTCGTC GTCATTGATT ACATCAGTAC GAAAGTGAGG GCACATCTCG CATGACACAG 8160 

GAAATAGCAA AATATAATGT TCACACAAAA GCACACAAAC GAAAATTGAT TAAAAGATGG 8220 

CTTATTGCAA TTGTCGTCTT AGCTATTATC ATCTGGGCAT TTGCAGGTGT ACCAAGTTTA 8280 

GAACTTAAAA GTAAATCATT AGAAATCTTA AAATCCATAT TCAGCGGATT ATTCCATCCT 8340 

GATATCAGCT ATATCTATAT ACCAGATGGC GAAGACTTAT TACGTGGTTT ACTTGAAACC 8400 

TTTGCGATAG CCGTTGTAGG TACTTTCATC GCCGCAATTA TCTGTATTCC ATTAGCATTT 8460 

CTAGGTGCAA ATAATATGGT AAAGCTAOGC CCAGTTTCAG GTGTTAGCAA ATTTATTTTA 8520 

AGTGTTATAC GTGTCTTCCC AGAAATTGTA ATGGCACTTA TATTTATCAA AGCTCTTGGC 8580 

CCAGGTTCAT TTTCAGGTGT ATTAGCTTTA GGTATCCATT CCGTAGtATG CTTGGGAAAC 8640 

TTTTAGCTGA AGATATTGAA GGTCTAGATT TCAGTGCTGT AGAATCATTA AAGGCCAGTG 8700 

GTGC25AATAA GATTAAAACA CTCGTATTTG CAGTCATACC ACAAATTATG CCTGCCTTTC 8760 

TATCACTCAT ACTTTATCGC TTTGAACTAA ACTTACGTTC AGCTTCTATA CTGGGGCTAA 8820 

TTGGGGCTGG TGGTATCGGG ACACCACTCA TATTTGCCAT TCAAACACGT TCTTGGGACC 8880 

GTGTAGGTAT TATATTAATC GGTTTAGTAC TAATGGTCGC AATTGTCGAT TTAATTTCCG 8940 

GTTCAATCCG AAAACGTATT GITTAACATT AAATCAGGAT ACTCCTAAAT AAGAAGTCCT 9000 

ACCGTCTTAC GTTTCTCTAT TATAATAAAA ACAGCAGTGA AGAAAACTAT TGTTATAGTT 9060 

AACTTCACTG CTGTTTTTAT AATATCTAAA TTTATTCTAT TTCAATTCCT TTAAATAACT 9120 

TTTACCGAAC TCTGGTAATG TTACGTTGAA ATTATCTGCT ATAGTTGCAC CGATAGAACT 9180 

GAATGTAGTA TCACTTTCTA GTGCATGACC ACCTTTAAAT TTCGGACTGT ACATAATTAC 9240 



TGTAATAATT ACTAAATCGT CTTCTTTTAA GTTGCTAAAC AGTTCTGGCA AGCXSATCATC 9360

GAAATCTTTA ATTGCTTGTG CATAACCTGG TTTATCACGA CGATGACCGT ATAATGCATC 9420

AAAGTCTACT AAGTTTAAGA AGCTAATACC TGTGaAATCT TTCTTAACAA TTTTCATCAA 9480

TTGATCCATA CCGTCCATGT TACTCTTCGT ACGAACCGCT TCTGTTACAC CTTCACCATC 9540

ATAAATGTCA TTAATTTTAC CGATGGCAAT AACATCATAA CCACCGTCTT TCAAATGATC 9600

TAAGACAGTT TTACCAAAAG GTTTTAACGC ATAGTCATGT GGATTAGATG TACGTGTAAA 9660

GTTTCCTGGT TCACCAACAT ATGGACGTGC GATAATACGA CCAATTAAAT ATTTAGGGTC 9720

TTTTGTCAAC TCACGAACCT TTTCACAAAT ATCATATAAC TCTTCTAATG GGATAATGTC 9780

TTCATGTGCA GCAATTTGCA ATACTGGGTC TGCACTTGTA TAAACAATTA AGTCACCAGT 9840

TTTCATTTGG TGCTCGCCCC ACTCATOGAT AATTTGCGTA CCCGATGCCG GTTTGTTAGC 9900

AACAACTTTA CGACCTGTCA TTTCTTCAAT TTGTTGAATT AACTCTTCAG GGAATCCATT 9960

AGGGTATACT TTAAAAGGTT GCATAATATT TAATCCCATA ATTTCCCAGT GACCAGTCAT 10020

TGTATCTTTA CCAACTGAAG CTTCACTCAA TTTAGTATAG TATGCTTCTG GTTGTTCAAC 10080

TGCATTTACT ACTGGTAATT TATCGATGTT CCCTAGACCT AACTTTTCAA GGTTTGGTAA 10140

AGTTTGATCG AAACCTTCTA AGGTATGTCT TAAAGTATGT GAACCTTCAT CTTTAAAATC 10200

AGCTGCGTCT GGCGCTTCAC CAATACCTAC TGAATCCATT ACGATTAAAT GTACACGATT 10260

AAATGGTCTT GTCATAGCTA TCACTCCCAA AATTTATATA TATTAGTAAT CTGAATCTGC 10320

TTCTAAACCT TGCATAATTT GAACACCTGC GCTCGCACCA ATACGTGTCG CACCTGCTTC 10380

AACCATTTTA TTGAAATCTT CTAAATTACG TACGCCACCT GATGCTTTTA CTTCTACATC 10440

AGCACCTACT GTATCTTTCA TTAATTTAAC GTCTTCTGCA GTCGCACCGC CACCTGCAAA 10500

ACCTGTTGAA GTTTTAACGA AGTCCGCACC AGCCGCTTTT GTTAATTCAC TCGCTTTTAC 10560

AATTTCGTCA TGGTCCAACA ATACCGTCTC AATAATCACT TTTACTGTGT GACCTTTCGC 10620

AGCTTTAACC ACTGCTTCAA TGTCTTGTTG TACATCATCA AAACGTCCAT CTTTTAATGC 10680

GCCGATGTTG ATGACCATGT CAATTTCATC TGCACCATTT TGAATTGCAT CTTCTGTTTC 10740

AAATGCTTTC GTTGCAGTTG TCGACGCACC TAATGGGAAT CCTATTACCG TACAAACGAG 10800

CACCTCTGAA TCAGCTAGTC GCTCTGCTGC ATATTTAACA TGTGTTGGAT TCACACATAC 10860

AGATTTAAAA TTGTATGctT TCGCTTCATC GATGATTTGA TCGATTTGCG TACGTGTTGA 10920

CTCAGGCTTC AATAAAGTGT GATCTATATA TTTCTCAAAT TTCATACTTA CTACTCCTCG 10980

TGTTATATAA TCTCTTTATT TAATTTTACT ATAAATACGA ATATATCTCG CGAATTTATA 11040
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ATACTCATTA AACCTAAAAT AATTAAAATA ATACCGAAAT GTGAACTTAA TGCATCATTG 11160 

CCTGGGAAAT TTAATGCTTT AAAATCGATT AGAGCCGCAG CAATCGCAAT ACCTACAGAT 11220 

ACCGCCACAT TAATAATTAA ATTATAAAAA CCAATAGCCA CACCTGTCAT ATTAAGATCT 11280 

ATTGTTTTAA TGGCTTCGTT AAGTAAAGGT GCATACATTA AAGCAAAGCT ACCTGCAAAG 11340 

AATATCATAG AAATGACGAA GATTGAAATG TGATTACCTA CTGCAAATGC AGGTAAAATC 11400 

AAGCTCAGTG CTATTAAAAT AATTGCTGTG ATAATCGCTT GTTTTGAATT CAGATATTCG 11460 

CCGATTTTAC CACTTAGTGC ACCAACAATG ACTGCTACTA TATAACCCGG TACTAATAAC 11520 

AGTGATGTTG TGTCTAGTTG CAGATGATAA ATTTGCTCCA TTATGAATGG GAACGTAAAA 11580

ATATAACCCA ATTGGATAGC ATACATTACA AATACTATAA ATAAAAATGA AGCATAACGT 11640 

TTATTTTGGA AAAATGATTT ATTTACTAAT GGACGTTGCG CATTTTTAAT ATATAGCGCA 11700 

AAAACGATAA TCGCAATTAA GGCACCAATC ATATATAACC AATTAAAGTT CGTAATAAAC 11760 
AGCATGACTG TTGTAGCAGG GGATCCTCTA GAGTCGAnCC TG 11802 
(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1196 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

CTAAAGAAGA TGCGAAACAA GATGTTGATA AACAAGTTCA AGCTTTAATT GACGAAATCG 60

ATCAAAATCC AAATCTAACA GATAAGGAAA AACAAGCACT TAAAGATCGT ATTAATCAAA 120 
TAC&CAACA AGGTCATAAC GACATTAACA ATGCGATGAC AAAAGAAGCA ATTGAACAAG 180 
CAAAAGAACG TTTAGCGCAA gCATTGCAAG ACATCAAAGA TTTAGTGAAA GCTAAAGAAG 240

ATGCGAAAAA TGATATTGAT AAACGTGTAC AAGCTTTAAT TGACGAAATC GATCAAAATC 300 
CAAATCTAAC AGATAAGGAA AAACAAGCAC TTAAAGATCG AATTAATCAA ATACTTCAAC 360 

AAGGTCATAA CGACATTAAC AATGCGCTGA CTAAAGAAGA AATTGAGCAG GCAAAAGCAC 420 
AACTTGCACA AGCATTGCAA GACATCAAAG ATTTAGTGAA AGCTAAAGAA GATGCGAAAA 480 
ATGCAATAAA AGCCTTAGCT AATGCGAAgc GTGATCAAAT CAATTCAAAT CCAGATTTAA 540 
CACCTGAGCA AAAAGCAAAA GCGCTCAAAG AAATTGACGA AGCTGAAAAA CGAGCACTAC 600 
AAAACGTTGA GAATGCTCAA ACTATAGATC AATTAAATCO AGGATTAAAC TTAGGTTTAG 660 
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TTGAAGCAAC ACCTGAGCAA ATCCTAGTTA ATGGTGAACT CATTGTACAT CGTGATGACA 780 

TCATTACAGA ACAAGATATT CTTGCACACA TAAACTTAAT TGATCAGCTT TCAGCAGAAG 840 

TCATCGATAC ACCATCAACT GCAACGATTT CTGATAGCTT AACAGCAAAA GTTGAAGTTA 900 

CATTGCTTGA TGGATCAAAA GTGATTGTTA ATGTTCCTGT AAAAGTTGTA GAAAAAGAAT 960 

TGTCAGTAGT CAAACAACAG GCAATTGAaT CAATCGAAAA TGCGGCACAA CAAAAGATTA 1020 

ATGAAATCAA TAATAGTGTG ACATTAACAC TGGAACAAAA AGAAGCTGCA ATTGCGnAAG 1080 

TTAATAAGCT TAAACAACAA GCAATTGGAT CATGTTnAAC AATGGCACCT GGATGTTCCA 1140 

TTCAGTTGAA GGAAATTTCA ACAACAAGGA ACAAGCGCCn GATTGGAACA ATTTGA 1196 

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1519 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:



CAATCGTTTC AACGCTATTA TCTTTAGACA ACAATTGTAA GCGTGTATGT GCAGTTTCTA 60

AACAGTCTAT AATTCGAGTT CTTAATTCAG CTGGATCATC TTTAAAAATA AAATCCATCG 120

CTGCAACTTT GTAGACAAAT GTTAAATAGG TAAGTTCACT GTGACTCGTA ACGAAAATAA 180

TGTTACCAAC TGGGTCATGC TTACGAATTT CACTGCCTAA TTTGATACCA TTAATATCAG 240

TTGAAAGTTG AATATCTAAA AAGTAACAGC CTATGTCATT CATATTTTTA GCTTGCTCAA 300

GCACCTCATA AGGATTATCA GTTGCGAGGG CAATTTCCAT AGGCTTTTCT TCTATCATTA 360

TATAATTTTT AATAATGGTA ACCATGTTTT CTCTTTGTTT TGGATCGTCT TCGCAAATGA 420

AAATTTTCAT ACATTCACAT CCTTATGGCT AGTTGTTAAT AATTTCAACT TTTTGAATAA 480

AGAAACCATT TTCGATAATT GTATCTAATA AGACATTGTC TGCATTATCA GCAATTTCTT 540

TTAAAGTTGA TAGACCTAAA CCACGACCTT CACCTTTAGT AGAAAAACTT TCTTGGAACA 600

ATTCATGAAT GCGTGGTATA TCATCAGCGC ATTTATTCAT AACAATAAAC GTTACTGAAT 660

TTTCACTTTC AATAAATGCA ACGCGAATGA TAGGGTCATC AATTTCAGTT GATGCCTCAA 720

TTGCATTATC AAGAATAATA CCAATACTGC GACTTAAATC GATCATATTC AAGTTAATGC 780

TACTTACTTC ATCGGGTATT TCGATACTAA TCGGAATATT CATTTCTTGT GCACGTAAAA 840

TTTTCGCAGT AATTAAGCCT TTAATTTCAC GTACTTTAAG ATTCTCGATA CCATTTAATT 900
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GTAC5GCCAGG CATGTCATCT TCTCX5AATGT ATTCTGAAAG TGTCGTTAAG ATATTGACAT 1020 

AATCATGACG GAACTTGCGC ATTTCGTTGT TGATAGCTTC AATCTTCAAT GTATATTCAT 1080 

AATAGGTTTC AATTTCTTCT TGATTACGTT TATATTTCAT CTCTTTAAGG AGAAATTGAG 1140

AAATAACAAA TGTTAATATA CTTAAAAATA TAGTGATACC AATAAAAATA AAAGAATACT 1200 

GCCTTATTAC TTTAGCTTCA TCCGAGTTTA TTTGTGAATA AAAGAAAAAT AATGAAAAAG 1260 

TAAGCAGTAA GATAGTCGAA ATAACTATTA AAAATCCTTT GTTTAGTATT AGATATGGTG 1320 

TGCTAATTTT TTTGAGAACT CTATTTATTA TATATGAGAA TAGTATACTA ATAGTCACAT 1380 

AAACTACAAA AAAGCTAGGG AATATTACAA ATATACTATC AGAAATTTTG GTGGATATAT 1440 

6CATATATAA CTATATACCT GTAGTTAGCA CnGXtlATAGG AATAATCnGG CX5AGGTCCAT 1500 

AATCCACCAA AATAGAATA 1519 
(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5445 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

GTAGGAATCT CTTTGTCTTT TTGGGAGGAC ATTTAATATG AATGTATATT TAGCAGAATT 60 

CCTAGGAACT GCAATCTTAA TCCTTTTTGG TGGTGGCGTT TGTGCCAATG TCAATTTAAA 120 

GAGAAGTGCT GCGAATGGTG CTGATTGGAT TGTCATCACA GCTGGATGGG GATTAGCGGT 180 

TACAATGGGT GTGTTTGCTG TCGGTCAATT CTCAGGTGCA CATTTAAACC CAGCGGTGTC 240 

TTTAGCTCTT GCATTAGACG GAAGTTTTGA TTGGTCATTA GTTCCTGGTT ATATTGTTGC 300 

TCAAATGTTA GGTGCAATTG TCGGAGCAAC AATTGTATGG TTAATGTACT TGCCACATTG 360

GAAAGCGACA GAAGAAGCTG GCGCGAAATT AGGTGTTTTC TCTACAGCAC CGGCTATTAA 420

GAATTACTTT GCCAACTTTT TAAGTGAGAT TATCGGAACA ATGGCATTAA CTTTAGGTAT 480

TTTATTTATC GGTGTAAACA AAATTGCCGA TGGTTTAAAT CCTTTAATTG TCGGAGCATT 540

AATTGTTGCA ATCGGATTAA GTTTAGGCGG TGCTACTGGT TATGCAATCA ACCCAGCACG 600 

TGATTTAGGT CCGAGAATTG CACATGCGAT TTTACCAATA GCTGGTAAAG GTGGTTCAAA 660 

TTGGTCATAT GCAATCGTTC CTATCTTAGG ACCAATTGCC GGTGGTTTAT TAGGTGCAGT 720 

GGTATACGCT GTATTTTATA AACATACATT TAATATTGGT TGTGCAATTG CrATTGTTGT 780 
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CGAATCAATT TACTAAAATA AAAAGAAACG TAAATAGCAT AATTTAACAT GTTTGATTCA 900 

TGGATTATGC TATTTTTTCG CCAAAATTTA ACAGATTTTG TACAATGGGT TAGCGATTAT 960 

TTTTTAATAA AGGAGATACT ACTAATGGAA AAATATATTT TATCTATAGA CCAAGGAACA 1020

ACAAGCTCAA GAGCGATTTT ATTCAATCAA AAAGGGGAAA TTGCAGGGGT AGCACAACGT 1080 

GAGTTTAAGC AATATTTTCC ACAATCAGGT TGGGTTGAAC ATGATGCAAA TGAAATTTGG 1140 

ACATCTGTGT TAGCTGTAAT GACGGAAGTA ATTAATGAAA ATGATGTTAG AGCTGATCAA 1200 

ATTGCAGGTA TCGGTATTAC AAACCAACGT GAAACAACGG TTGTTTGGGA CAAaCATACT 1260 

GGCCGCCCAA TTTATCACGC AATTGTTTGG CAATCACGTC AAACACAATC AATTTGTTCA 1320 

GAATTAAAAC AACAAGGATA TGAACAAACA TTTAGAGATA AGACAGGATT ACTTTTAGAT 1380 

CCGTATTTTG CAGGTACAAA AGTTAAATGG ATTCTAGACA ATGTTGAAGG TGCACGAGAA 1440

AAAGCAGAAA ATGGCGATCT ATTATTTGGA ACGATTGATA CTTGGTTAGT ATGGAAATTA 1500

TCaGGaAAAg CtGCGCATAT TACTGATTAT TCaAATGCGA GTCGTACATT AATGTTTAAT 1560 

ATCCATGATT TAGAATGGGA CGATGAGTTA TTAGAACTAt TACAGTACCT AAAAATATGT 1620 

TGCCAGAAGT TAAAGCTTCG AGTGAAGTAT ATGGTAAGAC AATTGATTAC CACTTCTATG 1680

GTCAAGAAGT ACCAATCGCT GGAGTAGCTG GTGATCT^CA AGCAGCATTA TTTGGACAAG 1740 

CTTGCTTCGA ACGTGGTGAC GTGAAAAACA CATATGGAAC TGGTGGCTTC ATGTTAATGA 1800 

ATACAGGTGA CAAAGCGGTT AAATCTGAAA GTGGTTTATT AACAACAATT GCTTATGGTA 1860 

TTGATGGAAA AGTAAATTAT GCGCTTGAAG GTTCCATCTT TGTTTCGGGT TCAGCAATCC 1920 

AATGGTTACG TGATGGATTA AGAATGATTA ATTCAGCACC ACAATCAGi\A AGTTATGCGA 1980 

CACGAGTTGA CTCTACTGAG GGTGTTTATG TTGTTCCAGC TTTTGTAGGT TTAGGAACAC 2040 

CATMTGQGA TTCTGAAGCA CGTGGTGCGA TTTTCGGTTT AACACGTGGA ACTGAAAAAG 2100 

AGCACTTTAT CCGTGCAACT TTAGAATCAC TATGTTACCA AACTCGTGAC GTTATGGAAG 2160 

CAATGTCAAA AGACTCTGGT ATTGATGTCC AAAGTTTACG TGTCGATGGT GGTGCAGTTA 2220 

AAAATAACTT TATTATGCAG TTCCAAGCAG ACATTGTTAA TACTTCTGTT GAAAGACCTG 2280 

AAATTCAAGA AACTACAGCT TTAGGTGCTG CATTTTTGGC AGGTTTAGCA GTTGGATTCT 2340

GGGAGAGTAA AGATGATATC GCTAAAAACT GGAAATTAGA AGAAAAATTC GATCCGAAAA 2400 

TGGATGAAGG CGAAAGAGAA AAATTATATA GAGGTTGGAA AAAAGCTGTT GAAGCAACAC 2460 

AAGTTTTTAA AACAGAATAA ACTTGTAGAT TAGACTTTTG TATAAACATT GTGATACAAT 2520 

CAATTTAAGT TAATATTTGA ATCGAGAAGC GAGAGATTTG TTCGAACATG TACAATTGAA 2580 
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GCATTGTCTA CTTTTAAGAG AGAACATATT AAAAAGAATT TAAGAAATGA TGAATATGAT 2700 

TTAGTAATTA TTGGTGGCGG TATTACAGGT GCAGGTATTG CACTAGACGC GAGTGAAAGA 2760 

GGAATGAAAG TTGCATTAGT TGAAATGCAA GACTTTGCAC AAGGAACAAG CTCAAGATCT 2820 

ACAAAATTAG TCCATGGTGG TTTGCGTTAC TTPAPiACAAT TCCAAATTGG AGTAGTTGCC 2880 

GAAACTGGTA AAGAACGTGC GATTGTTTAT GAAAATGGGC CTCATGTTAC GACTCCAGAG 2940

TGGATGCTTT TACCAATGCA TAAAGGTGGA ACATTTGGTA AATTCTCAAC ATCAATTGGT 3000

TTAGGAATGT ATGATCGTTT AGCAGGTGTT AAGAAGTCTG AACGTAAAAA AATGTTATCT 3060

AAAAAAGAAA CTTTAGCTAA AGAACCATTA GTTAAAAAAG AAGGTCTAAA AGGCGGCGGT 3120

TACTATGTTG AATATCGTAC TGACGATGCG CGTTTAACTA TTGAAGTTAT GAAGCGTGCT 3180 

GCTGAAAAAG GCGCAGAAAT TATCAACTAT ACTAAATCTG AACACTTCAC TTATGATAAA 3240 

AATCAACAAG TAAATGGTGT TAAAGTTATA GATAAATTAA CTAATGAAAA TTATACAATT 3300

AAGGCTAAAA AAGTGGTTAA TGCAGCAGGT CCATGGGTTG ATGATGTTAG AAGTGGTGAT 3360 

TATGCACGCA ATAATAAAAA ATTACGTTTA ACTAAAGGTG TACATOTTGT TATTGATCAA 3420 

TCAAAATTCC CATTAGGTCA AGCAGTATAC TTTGATACTG AAAAAGATGG AAGAATGATT 3480

TTTGCAATTC CACGTGAAGG AAAAGCGTAT GTAGGTACTA CAGATACATT CTATGACAAT 3540

ATCAAATCTT CACCATTAAC TACACAAGAA GACAGAGACT ATTTAATCGA TGCGATTAAT 3600 

TACATGTTCC CTAGTGTTAA TGTTACAGAT GAAGATATTG AATCAACATG GGCAGGAATT 3660 

AGACCATTAA TTTACGAAGA AGGCAAAGAC CCTTCTGAAA TCTCTCGTAA GGATGAAATT 3720 

TGGGAAGGTA AATCAGGTTT ATTAACTATT GCAGGTGGTA AATTAACAGG CTATCGTCAC 3780 

ATGGCTCAAG ACATTGTTGA TTTAGTATCT AAACGCTTGA AAAAAGACTA CGGTTTAACA 3840 

TTTAGTCCAT GTAATACAAA AGGTCTGGCA ATTTCAGGTG GCGATGTAGG TGGTAGCAAG 3900

AACTTTGATG CGTTTGTAGA GCAAAAAGTA GATGTAGCTA AAGGATTCGG CATTGATGAA 3960

GATGTTGCAA GACGTTTAGC ATCTAAATAT GGTTCAAATG TTGATGAATT GTTCAACATT 4020

GCGCAAACAT CTCAATACCA TGATAGCAAG TTACCATTAG AAATTTATGT AGAACTTGTT 4080

TATAGTATTC AACAAGAAAT GGTATACAAA CCTAACGATT TCTTAGTTCG TCGTTCTGGT 4140

AAAATGTATT TCAATATTAA AGATGTATTA GATTATAAAG ATGCTGTCAT CGATATTATG 4200 

GCAGATATGC TTGATTACTC TCCAGCTCAA ATTGAAGCAT ATACTGAAGA AGTTGAGCAA 4260 

GCAATTAAAG AAGCGCAACA TGGaAATAAT CAACCAGCAG TTAAAGAATA AtTAATTTGT 4320 

ACAATCATAA ACTGGTGTCC TGTTTTAAGG GCATCAGTTT TTTTATACGA GATACATTAG 4380 

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GTTATTAAAG GTGTGAGATG ATGACTGAAA AACAATTTAA ATTAACTGTA CAAGATAATA 4500 

CGAATATTGA AGTTAAAGTG AATTTTACAG ATGTAGATTC AAAAGGAATT ATTCATATAT 4560 

TTCATGGTAT GGCTGAACAT ATGGAACGTT ACGATAAATT AGCACATGCA CTTTCAAAGC 4620

ATGGCTTCGA TGTGATACGT CATAATCATC GAGGACATGG TATTAATATT GATGAATCAA 4680 

CAAGAGGGCA TTACGATGAT ATGAAACGAG TTATCGGTGA TGCCTTTGAA GTAGCGCAAA 4740 

CAGTGAGAGG CAATGTTGAT AAACCATACA TTATAATCGG ACATTCAATG GGATCCGTTA 4800 

TAGCTAGATT GTTTGTAGAA ACATATCCGC AATATGTTGA TGGTCTAATT TTAAGTGGTA 4860 

CTGGTATGTA TTCATTATGG AAAGGTTTAC CAACCGTTAA AGTGTTACAA CTGATTACAA 4920 

AAATTTATGG TGCTGAGAAA CGAGTTGAAT GGGTTAACCA GTTAGTATCA AATAGTTTTA 4980

ATAAAAnnAT ACGTCCATTA CGTACACAAA GTGATTGGAT TTCTAGTAAT CCAATTGAAG 5040 

TAGATAaCTT TATTAAAGAT CCATATAGTG GaTTTAATGT GTCA/UITCAA TTATTATATC 5100

AAACAGCCTA TTATATGCTA CATACATCAC AATTAAAAAA TATGAAAATG TTAAaTCATG 5160 

CCATGCCTAT ATTATTAGTT TCAGGATATG ACGATCCTTT AGGTGATTAT GGTAAAGGGA 5220 

TTTTAAAATT GGCGAATATA TATAGAAACG CTGGCATnAA AAATGTTAAA GTGAATCTTT 5280

ATCATCATAA ACX5TCATGAA GTGTTATTTG AAAAnGATCA TGACnAAATT TGGGAAGACT 5340 

TGTTTAAATG GTTGAATCAA TTTTATAAAA AATAAAGAAA GTGGAATTAA ATATGAATAA 5400 


AAATAAGCCT TTTATTGTAG TAATTGTGGG GCCAACTGCT TGCAG 5445 

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2569 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 



TGGCTTGAAC TACGCCAATA AGTCCCCCTA GTACAAGAAT GAATACCATG ATATCGACCG 60 

CTTCTATCGT ACCTTCAACC ATGCTACTTG TTATTTGTTC TGGTCCAGCT GGATGTTGCT 120

TTAATCTTTC ATAAGTATTC GGAATTGATA CCGGCTTATT AATTGCACCT GATTTAAATT 180 

GTTCAATCTT AATTTTAACC CCCATTTTGT CTAGTTCCTG TTGCGTACCC GGAACCTTTT 240


TCACTTGGTT ATGAGGGTTA ACTATCTTTA GTTCTTGGGA TGAAGGTTCG TAAGAAAGTT 300 

TAGAATATGC ACCAGCAGGA ATAACCCATG TTGCTATAAC TGCAACAACC GTTAAAATGA 360 
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TAATTGTATT TTCCACGGTT TCATCTCCTT CGACATTTAA CCTAGCATTT CTACCTTAAA 480

GATTTTATAA ATATAAATTA AGAAAGTGCA CCCCGCATCA AAATAGAGGC ATTATTTTCA 540

GGGGGTGCAC ATAAATAATA AAAATCATGC ATTTGACATA TAGTAATTGA AAAGCGTTTC 600

AATTCAATTA CTTTTTAATC ACAGTACCTA CTTTACCCTC TAAGGCAGCA TCTAATTCAT 660

TTAATGATGT TATAAGCACA CTTCCTTTTG GATTGTTTTC AATAAATGAT ATGGCTGCTT 720

CAATTTTTGG TAACATACTT CCTTTTGCAA ATTGATTTTC GTCTATATAT CGTTTTAATT 780

CATCAACATT TGTTGTTTTC AAAGGCTGTT GGTTTTCAGT GTTAAAATTA ATATATACAT 840

AATCAATTGC TGTTAAAATA ATCAATTGAT CGCATTGAAT ATTAGCACCC AACAACGCAC 900

TTGTTTTATC TTTGTCTATA ACTGCATCAA TACCTTTAAA ACCATCATGT TGCTCTCTAA 960

TTACTGGTAT ACCTCCACCA CCAGCAGCAA TAACGAGTGT ATCATTTTTA ATAAGTGTTT 1020

TAATACTCTC TAATTCAATA ATAGAGATGG GTTGTGGTGA AGGAACAACG CGTCTATATC 1080

CTCTTCCAGC ATCTTCAACA AATATAAATC CTTTTTCTTT TTGAATTTGT TCAGCTTCTT 1140

CTTTGTTGTA AAATAACCCA ATTGGTTTTG AAGGATTGTT AAATGCCGGA TCATTTTCAT 1200

CAACTTCAAC TTGTGTCACT AGTGTTACCA CTTGTTTATC CATTCCAATA GAATGCAATT 1260

CATTTTGTAA GCTTTCTTGT AATTGATAGC CGATGTAAGC TTGACTCATT GCGCCACATT 1320

CAGCAAATGG AAATGCCGGA CCTTGGTTAT GTTCTGCAGC ATAGTTAAGT CCCAAATTAA 1380

TGCTTCCAAC CTGTGGTCCA TTACCATGAC TAATAACAAT CTCATGTCCT TTTGTnATTA 1440

AyCCTACTAA TGATTtCGCA GTATTTTTAA CAAGCTCGAG TtGgTyCTTG aGGTGATTTn 1500

CCTAAAGCAT TACCACCTAA TGCTACTACT ATTTTCGCCA TCATATTCAC TTCCTTATAT 1560

CATTTAAAAT TCACCCAATG TAGCAACCAT GaCTGCTTTG ATTGTATGCA TTCTGTTCTC 1620

AGCTTCTTGQ AATACAACTG AAGCTTTACT TTCGAATACT TCATCTGTAA CTTCCATTTC 1680

TCGAATACCA TATTTTTCAA AAATTTGTTG ACCTATTTTC GTATCAGCAT TATGGAAAGA 1740

TGGTAAGCAA TGCTCAAAAA TAACATTTGG ATTACCAGTT TTATCCATTA TTTCTTTATT 1800

TACTTGATAT GGTTTCAATA ATTCAAGTCG TTCTTTCCAT ACTTCATCAG GTTCACCCAT 1860

TGATACCCAA ACATCAGTGT AAATTACATC CGAACCTTTT ACaCCTTGGT CaATATCATC 1920

TGTGATTAAT ATGTTGCCaC CATTTTCaGC GGCAATATTT TTACAGCGAT TTAATAATTC 1980

ATCTGTTGGA TTTAATTCTT TTGGACAAAC TAAATGGAAG TTCATACCCA TAATGGCAGC 2040

ACCTTGCATT AATGCATTTG CAACGTTATT ACGACCATCT CCAACATATG TAAAGTTAAT 2100

ATCTGCATAA TCTTTTTTTA AGACTTCTTT TGCTGTTAAG AAATCAGCAA GAACTTGAGT 2160
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TTCTACTGTT CTTTGTGAAA AACCACGGTA TTCAATGCCA TCATACATTC CACCAAGCAC 2280 

ACGTGCAGTA TCTTTAGTTG TTTCTTTTTT ACCCATTTGT GATCCAGTTG GGCCTAAATA 2340 

AGTTACATTT GCACCTTGAT CATGCGCTGC AACTTCAAAT GCACATCGCG TTCTTGTAGA 2400

ATCTTTTTCA AATAACAGTG CAATATTTTT ATTTTTTAAC ATAGGCTTTT CAGTGCCAAT 2460 

ATATTTAGCA CGTTTTAAAT CCTCGGAGAG TGTTAATAAG GTTCTACCTC TTGTCGTGAA 2520 


AAGTCTAATA AAGTTAAAAA ACTTCTGTTT CGTAnATTTT TCATTAAnA 2569 
(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1273 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 



CCTGGAACCA TCCaATCGtG CaAATCtTGa AAGaGAATAC GCAACAACAA TTAAATGTAT 60

TGGAACACTA TATTCCAAAT GACCATCCAG CACTCGTTGA ATTAAAAATA TGGGAACGTT 120

GGTTACATAA ACAAGGTTAC AAAGACATCC ATTTAGATAT TACTGCGCAC CACCTAGATC 180

CTATTACACA GGTTTATTTA TTCAATGTCA TTTTGCTGAA AATGAATCTC GAGTTTTAAC 240

AGGTGGTTAT TACAAAGGAA GCATCGAAGG GTTTGGATTA GGATTAACAC TTTAAGTAAG 300

GGAGTATGCA CAATGTTAAG AATCGCCATA GCCAAAGGAC GTCTAATGGA TAGTTTAATT 360

AACTATTTAG ATGTAATTGA ATATACGACA TTATCAOAAA CATTAAAAAA TAGAGAACGC 420

CAATTATTAT TAAGTGTAGA TAATATTGAA TGCATTTTAG TAAAAGGAAG TGACGTGCCA 480

ATCTATGTGG AACAAGGAAT GGCAGACATA GGCATTGTTG GTAGCGACAT ATTAGATGAG 540

CGCCAATATA ATGTTAATAA TTTGTTGAAT ATGCCTTTTG GAGCATGTCA TTTTGCGGTT 600

GCAGCGAAAC CTGAAACGAC CAATTATCGT AAAATCGCAA CGAGTTATGT TCATACTGCT 660

GAAACATATT TTAAATCAAA AGGTATTGAT GTCGAATTGA TTAAATTGAA TGGCTCTGTT 720

GAATTGGCCT GTGTTGTAGA TATGGTAGAC GGAATTGTCG ACATCGTTCA AACAGGTACT 780

ACGCTAAAAG CGAACGGACT GGTTGAAAAG CAACATATTA GTGATATCAA TGCAAGATTA 840

ATAACTAATA AAGCAGCTTA TTTTAAAAAA TCACAATTAA TAGAGCAATT TATTCGCTCT 900

TTGGAGGTGT CTATTGCCAA TGCTTAATGC ACAACAATTT TTAAATCAAT TTTCATTAGA 960

AGCACCATTA GATGAGTCAT TGTATCCaAT TATTCGCGAT ATTTGTCAGG AAGTTAAAGT 1020
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TTTAGaAATT AGTCATCSAmC AAATTAAAGC AGCATTTGAC ACATTAGATG AAAAAACAAA 1140 
ACAAGCATTA CAACAAAGTT ATGAAAGAAT TAtlAGCATAT CAaGAAaGTA TtaAACAGaC 1200 
GaATCAACAG TTAGAAGaAT CAGTGGaGTG tTrTGaAATA TACCATCCmC taGaAAGTGT 1260

CGGTATTTAT GTG 1273 
(2) INFORMATION FOR SEQ ID NO: 76: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1308 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

GTTGATAAAT TAAAAATGTT TTTATCAGAT ATTCAAAGTT ACCAACAATA TAGTAAAGAT 60
CATCCGGTGT ATCAGTTAAT TGATAAATTT TATAATGATC ATTATGTTAT TCAATACTTT 120 
AGTGGACTTA TTGGTGGACG TGGACGAOGT GCAAATCTTT ATGGTTTATT TAATAAAGCT 180 

ATCGAGTTTG AGAATTCAAG TTTTAGAGGT TTATATCAAT TTATTCGTTT TATCGATGAA 240
TTGATTGAAA GAGGCAAAGA TTTTGGTGAG GAAAATGTAG TTGGTCCAAA CGATAATGTC 300 
GTTAGAATGA TGACAATTCA TAGTAGTAAA GGTCTAGAGT TTCCATTTGT CATTTATTCT 360 


GGATTGTCAA AAGATTTTAA TAAACGTGAT TTGAAACAAC CAGTTATTTT AAATCAGCAA 420 
TTTGGTCTCG GAATGGATTA TTTTGATGTG GATAAAGAAA TGGCATTTCC ATCTTTAGCT 480 
TCGGTTGCAT ATAGAGCTGT TGCCGArAAA GAACTTGTGT CAGAAGAAAT GCGATTAGTC 540 


TATGTAGCAT TAACAAGAGC GAAAGAACAA CTTTATTTAA TTGGTAGAGT GAAAAATGAT 600 
AAATCATTAC TAGAACTAGA GCAATTGTCT ATTTCTGGTG AGCACATTGC TGTCAATGAA 660 
CGATTAACTT CACCAAATCC GTTCCATCTT ATTTATAGTA TTTTATCTAA ACATCAATCT 720 


GCGTCAATTC CAGATGATTT AAAATTTGAA AAAGATATAG CACAAATTGA AGATAGTAGT 780
CGTCCGAATG TAAATATTTC AATTGTGTAC TTTGAAGATG TGTCTACAGA AACCATTTTA 840 
GATAATGATG AATATCGTTC GGTTAATCAA TTAGAAACTA TGCAAAATGG TAATGAAGAT 900
GTTAAAGCAC AAATTAAACA CCAACTTGAT TATCGATATC CATATGTAAA TGATACTAAA 960 

AAGCCCTCAA AACAATCTGT TTCTGAATTG AAAAGACAAT ATGAAACAGA AGAAAGTGGC 1020 


ACAAGTTACG AACGAGTAAG GCAATATCGT ATCGGTTTTT CAACGTATGA ACGACCTAAA 1080 

TTTCTAAGTG AACAAGGTAA ACGAAAAGCG AATGAAATTG GTACGTTAAT GCATACAGTG 1140 
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GATGGATTAA TCGATAAACA TATTATCGAA GCAGATGCGA AAAAAGATAT CCGTATGGAT 1260 
GAAATAATGA CATTTATCAA TAGTGATTAT ATTCGATATT GCTGAAGC 1308 





(2) INFORMATION FOR SEQ ID NO: 77: 








(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1431 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 








(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 








GATGCCATTn ATnnGTATGC AAGAAGTTGT TCCGGGTTCA GGTGGATTaC CAGTTGGTAC 60

TGGTGGTAAG ACGTTACTAA TGCTTTCAGG CGGTATAGAC TCACCAGTTG CTGGGATGGA 120

AGTGATGAGA CGTGGCGTAA CAATTGAAGC GATTCATTTC CATAGTCCAC CATTTACAAG 180

TGATCAAGCA AAAGAAAAAG TTATTGAATT GACACGTATT TTAGCTGAAC GTGTTGGACC 240

AATTAAATTG CATATTGTAC CATTTACAGA ATTGCAAAAA CAGGTAAATA AAGTTGTACA 300

TCCAAGATAT ACAATGACTT CAACGAGACG TATGATGATG CGTGTTGCTG ATAAATTAGT 360

ACATCAAATA GGGGCTTTAG CTATTGTAAA TGGTGAAAAC CTAGGGCAGG TAGCCAGTCA 420

AACACTTCAT AGCATGTATG CAATTAATAA TGTAACTTCT ACTCCTGTAT TACGTCCTTT 480

ATTAACTTAC GATAAAGAAG AAATTATTAT TAAATCGAAA GAAATTGGTA CATTTGAMC 540

ATCTATTCAA CCATTTGAAG ATTGTTGTAC AATTTTCACC CCTAAAAATC CAGTAACCGA 600

ACCAAACTTT GATAAGGTAG TCCAATATGA AAGTGTCTTT GATTTTGAAG AGATGATTAA 660

TCGTGCTGTT GAAAATATTG AAACACTTGA AATAACTAGT GATTATAAAA CTATTAAAGA 720

ACAGICAAACA AACCAATTAA TAAACGACTT TTTATAAATA AAATCCTAGA GTAAATTTAA 780

ACATAAGGGG ATGTTAAACT ATGGATTTGA ACTTAACGAT GATTATAATC ATAATTTTAT 840

TTGGTTTTAT CGCGGCGTTT ATAGATTCGG TTGTAGGGGG TGGCGGTTTA ATTTCTACGC 900

CAGCATTATT AGCAATCGGT CTACCACCAT CTGTGGCTTT AGGTACAAAT AAATTGGCAA 960

GTTCGTTTGG TTCTTTAACT AGTACGATAA AGTTTATAAG GTCCGGTAAA GTGGACTTAT 1020

ATGTTGTTGC CAAATTATTT GGTTTTGTAT TTTTGGCATC TGCATGTGGC GCATATATTG 1080

CAACGATGGT TCCGTCACAA ATATTGAAAC CTTTAATCAT CATTGCACTT TCGTCGGTGT 1140

TTATATTCAC ATTACTTAAA AAAGATTGGG GCAATACACG CACGTTTACT CAATTTACAT 1200

TTAAGAAAGC CATAATATTT GCAGCACTTT TTATATTAAT CGGCTTTTAT GATGGATTTG 1260
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TAAGTGCAGC AGGAAATGCT AAAGTTTTGA ACTTTGCTTC TAATATAGGT GCGCTTGTAT 13 80 

TATTTATGGT ATTAGGACAA GTAGATTATG TAATAGGTTT AATTATGGCT A 1431 


(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4403 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:



AATATTATTT TAAATTCAAT ATTTATTGGT GCATTTATTT TAAACTTATT ATTCGCCTTT 60

ACCATTATTT TCATGGAAAG ACGTTCTGCC AATTCTATCT GGGCTTGGTT ACTAGTCTTA 120

GTTTTCTTGC CTTTATTOGG CTTCATTTTA TACTTACTAT TAGGACGACA AATTCAACGT 180

GACCAAATTT TCAAAATTGA TAAGGAAGAT AAAAAAGGAT TAGAGTTAAT CGTTGATGAG 240

CAATTAGCTG CTTTAAAAAA TGAAAACTTT TCAAATTCCA ATTATCAAAT TGTAAAATTT 300

AAAGAAATGA TTCAAATGTT GTTATATAAT AACGCAGCAT TTTTAACAAC AGACAACGAT 360

TTArrrrtAT ACACAGACGG CCAAGAAAAA TTTGATGACC TAATACAAGA CATCCGTAAT 420

GCTACTGATT ATATTCATTT TCAGTACTAT ATTATTCAAA ATGATGAATT AGGTCGTACC 480

                                 CAAGGTGTAG AAGTTAAAAT TCTTTATGAT 540

GACATGGGTT CTCGTOGACT GCGTAAAAAA GGCTTACGCC CGTTTCGCAA TAAAGGTGGA 600

CATGCTGAAG CATTTTTCCC ATCAAAATTA CCTTTAATTA ACTTGCGTAT GAACAATCGA 660

AACCATCGAA AAATTGTTGT AATAGATGGG CAAATTGGAT ATGTTGGTGG TTTTAATGTT 720

ggtc£tgagt ACTTAGGTAA ATCAAAAAAA TTCGGCTATT GGCGAGATAC GCATTTACGA 780

           ATGCAGTGAA TGCATTGCAA TTACGATTTA TTCTAGATTG GAATTCACAA 840

GCCACACGTG ACCACATCTC CTATGATGAT CGTTATTTCC CAGATGTAAA TTCTGGTGGA 900

ACAATTGGCG TTCAAATAGC TTCTAGTGGT CCTGACGAAG AATGGGAACA GATTAAATAC 960

GGCTATTTGA AAATGATTTC ATCTGCTAAA AAATCGATTT ATATTCAATC TCCCTATTTC 1020

ATACCTGATC AAGCCTTTTT AGATTCTATT AAAATTGCGG CATTAGGTGG TGTTGATGTC 1080

AATATCATGA TTCCTAATAA ACCTGACCAT CCGTTTGTTT TTTGGGCTAC TTTAAAAAAT 1140

GCAGCATCCT TATTAGATGC CGGTGTTAAA GTATTTCACT ACGACAATGG CTTTTTACAC 1200

TCAAAAACAC TTGTTATAGA TGATGAAATT GCAAGTGTGG GAACAGCTAA TATGGACCAT 1260
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AAATTAAAAC AAGCTTTTAT AGATGATTTA GCAGTATCTT CTGAATTAAC AAAAGCACGT 1380

TATGCTAAGC GAAGTCTTTG GATTAAATTT AAAGAAGGTA TTTCACAATT ATTGTCACCT 1440

ATCTTATAAA ATAGAAATAT GAGGAGTGTA aCTTTAATGC AACAATCAGA CGTCATTAGT 1500

GCTGCCAAAA AATATATGGA ATCTATTCAT CAAAATGATT ATACAGGCCA TGATATTGCG 1560

CATGTATATC GTGTCACTGC TTTAGCTAAA TCAATCGCTG AAAATGAAGG TGTTAATGAT 1620

ACTTTAGTCA TTGAACTCGC ATGTTTGCTT CATGATACCG TTGACGAAAA AGTTGTAGAT 1680

GCTAACAAAC AATATGTTGA ATTGAAGTCA TTTTTATCTT CTTTATCACT ATCAACCGAA 1740

GATCAAGAGC ACATTTTATT TATTATTAAT AATATGAGCT ATCGCAATGG CAAAAATGAT 1800

CATGTCACTT TATCTTTAGA AGGTCAAATT GTCAGGGATG CAGATCGTCT TGATGCTATA 1860

GGCGCTATAG GTGTTGCACG AACATTTCAA TTTGCAGGAC ACTTTGGTGA ACCTATGTGG 1920

ACAGAACATA TGTCACTAGA TAAGATTAAT GATGATTTAG TTGAACAGTT GCCACCATCT 1980

GCAATTAAAC ATTTCTTTGA AAAATTACTT AAGTTAGAAT CTTTAATGCA TACAGATACG 2040

GCGAAGATGA TTGCTAAAGA ACGTCACGAC TTTATGATGA TGTACTTGAA ACAGTTTTTT 2100

ACGGAATGGA ATTGTCACGA CTAGACATTG AAGTTGTAGT ATGATGATGC GATGTAATGG 2160

CGTGTTGTTG TGGAAGCTTG GTGTCATGCC ATGTTACTTT GATGTGTTGT TGTGGGAGCT 2220

TGGTGACATG TCATGCTACT TTGATGTGCT GGTACCACGA TGCGTCTTGA TGTAGTGCTA 2280

TGATGTGGCA TTGCGGTGTT ATGGTGTTAT AGACAGGTTT GGCGTTGATG CCATGTTACT 2340

TTGATGTGCT GGTACCACGA TGCGACTTGA TGTAGTGCTA TGATGTGGCA TTGCGGTGTT 2400

ATGGTGTTAT AGACCGGTTT GATGTTGATG CCATGTTACT TTGATGTGCT GGTGCTACGA 2460

TGCGACTTGA TGTAGTGCTA TGATGTGGCG TTGCGCTGTT ATGGTGTTAT AGCCAGGTTT 2520

GGTG3TGATG TCATGCCGTT ACGATTCTAT GATATGTTGT TGGGACGTTG CAATGTGTAT 2580

TATGCCGTTG TGACGTTATT ATTTCACACT GTTACATGTA TAAGTGAATT GCTGTGGAAA 2640

TTTGCGACAT ATACTGCTAC ACTGATGAAT CATTGTGTCA AGATGACATT GCGATGAAGA 2700

ATGACAACTC TGTTATTAAC CACTTTTTAC ATACTGAAAA CTCGTTAATA TTATTTCAAA 2760

TAAAAACAGC AGTAGGATGA CTTTCACATT TGAAATCATC TTACTGCTGT TTCTATTTAT 2820

CACATATTGT ATAATGTGAC ACTAAGTTTC GCTATTGAAG CGAAAAATAA TGTGCGCCCT 2880

ATAAAGTTAA AATTATCTTC AACTTTTAGG GTGCACATTA TTTGGACTTG CTAAGGTTAT 2940

TTCTTTTTCT TTTTAGACAC AACTTGTGTG TTTTTGCCTT TTTTATTGCt GCOGCCGTTG 3000

TGCTCTCTTT CATACGCTTC AATGAAAGGT TGTACTTCTT TTTTAGCGAC TTTTTCATAA 3060
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IS 



CCAAGTGCTO ATXXTGAGCT TAATGAAATC CAGATAATCA TAATTGGTGA AATGACCATC 3180 

ATCATGTAAC CCATTTGACG TTGTTCGTCT GGCATCGTTT TACTTGATAC ATATGCTTGG 3240 

ATAAAGTATA AAACACCGGC AATAATTGTA ATCCAAATAT CAGGACGTCC TAAATCGAAC 3300 

CATAAGAAGT GTGGATATTT AAACAAACCA TCTACAAGTT GGTCTTTAAG TACAAAGTAT 3360 

AATCCCATGA TGATTGGTAA TTGGATTAGC ATTGGTAAAC AACCCAACAT ACTCTTAATC 3420 

GGGTTCATGT CATACTTTTT ATATACTTGC ATTAATTCTT GGTTTGCAGC CATTTTTTCT 3480 

TCTTGTGTAC GCGnCaCGTT cACTTTTTCT TGAATTTTTT CAACTTCTGG CTTTGCAACT 3540 

TTCATTTTTT GACGCATCAT ATC5ACTATTT TTATAGTTTG ACAACATGAA TGGTAATAAA 3600 

ATAATACGAA TTACCAATAC AAGGATAATA ATAGCTAAAC CATAATTGTC GTTTAATAAG 3660 

TTATTTCCCA ACCAATCX7A TACATTTTTC ATTGGATCTA CGAATGTATT GTAGAAAAAy 3720 

CwCtACGTTT TTCAGGTTTA GAATAGTCAC AACCAGCCAA AAAGACCATA ATACCTAAAA 3780

ATAATGGTAG TAACGCTTTT TTCTTCATTT TTCCACCTCT ATCATTATAT TCACATAGGA 3840

TTTATTCTAT CACATTAATG AGTACGTATG AAACAATAAG TGGAAAAATT TAACTAATTA 3900 

TTAAAAAAAT CTTTGAATCG ATTAACAGTC TTTTCAATAT TTTCACTTTT AGAAATGGCT 3960

GAAATGACTG AAATTCCATT GGCACCTGCT TCTACAATCG GCGCCACATT ATTAGTATTG 4020 

ATACCGCCAA TAGCTACAAT CGGTAGTTGC GGATTCATTT CTTTAAACGT TGCAATCATT 4080 

TCTGGACCTA CTGGTATATG CGCGTCATGC TTCGACGGCG TAGGATAGAT TGGTCCAACA 4140 

CCTATATAAT CmACATGAGT TAAATCAGAT TTTGCATACT CATCTAAATC ACTAATACTA 4200 

AGTCCAATAA TTTTATCAGT GAAATATTGT GCTATCTCTT TGACTTTCGC ATCATCTTGA 4260 

CCGACATGTA TACCATCCGC GTTAATTTCT TTTGCCAAGG ATACATCATC ATTAACGATA 4320 

AAAGGCACAT CATATTGATG ACAGAGATGC TGTAATTCTT TAGCTAATAC AAGTTTATCG 4380 

TTTCCTTTTA AAGCTGATTC ACC 4403
(2) INFORMATION FOR SEQ ID NO: 79: 



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35 



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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1808 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 
TGGAnCCAAT ATTAGAAATG ATTAAAACAT TAACAGGTAT TAATAGTCCT TCAGGAGnCA 60 
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TAACAAATAA AGGTGCGTTA TTAATAACAG TGCCAGGCAA AAATGATGAA GTACAACGCT 180 

GTATTACTGC TCATGTTGAT ACTTTAGGTG CaATGGTTAA AGAAATTAAA GAAGATGGTC 240 

GCTTaGCAAT AGAATTAATT GGAGGATTCA CGTATAACGC GATTGAGGGT GAATATTGCC 300 

AAATTAAAAC TGATGCTGGT CAAATATATA CAGGAACAAT TTGTCTGCAT GAAACAAGTG 360 

TTCATGTATA TAGAAATAAT CATGAAATAC CTAGAGATCA AAAGCATATG GAAATAAGAA 420 

TTGATGAAGT AACTACATCA GAAGAAGATA CAAAGAGTTT AGGTATTTCA GTAGGTGATT 480 

TTGTTAGCTT TGATCCACGT ACAOTTATCA CGTCATCAGG TTTTATTAAA TCTCGTCATT 540 

TAGATGATAA AGCTAGCGTA CGgTtGATAC TACAATTACT AAAGAAATTA AAAGAAGAGC 600

AAATAATATT ACCACATACA ACGCAATTTT ATATTTCTAA TAACGAAGAA ATAGGTTACG 660 

GTGCAAATGC ATCAATTGAT TCGAAAATCA AAGAATATAT TGCATTAGAT ATGGGCGCX3T 720 

TGGGAGACGG TCAAGCATCG GATGAATATA CAGTTTCTAT TTGTGCCAAA GATGCTTCAG 780

GTCCATATCA TAAGCAATTG AAATCGCACC TAGTTAATCT TTGCAAAATA AATAACATTC 840 

CATATAAAGT AGACATATAT CCATATTATG GTTCAGATGC TTCAGCAGCT TTACATGCTG 900 

GTGCGGATAT CAGACATGGT TTATTTGGCG CTGGCATTGA ATCATCTCAT GCAATGGAAC 960 

GAACACATAT TGATTCTATT AAAGCGACAG AGAAATTACT ATATGCATAT TGCTTATCAC 1020 

CAATTGAGTA AACAATTAGT GTTGACAAAT GTGaACGACC TATGTAATAT AATGAACTAT 1080 

AAAAATAATT AGAATTTTCT AAAGAAATAG TAGCAGATAT GAAACGTAGC AAATAGAAAG 1140 

CTAATGGGTG ATGGGAATTA GCACGCCATA TCTTGTGAAT TGGACTTTGG AAAACAATTG 1200 

AATGAGTTTT GAAAGTGAAC ATGAATTATG TTAACTAAGG TGGCACCACX3 GTAACGCGTC 1260 

CTTACAGGTA TATGCGTTAT GTGGTGTCTT TTTATTTAGA CAAAATGTAG TAGTTAATTA 1320 

AAGGTAGCAA CAGAAAGTTA GTGGATGATG TGAACTAACA CCQAGATTAA TGAAATTGGG 1380 

TTTTGTCTGC AACAGAAAAA TTATATATAG TAAAGAGTGA ACTATGAATA TTTCGAATAT 1440

TCGGTTAATT TAGGTGGTAC CACGCGTCAc nTCCTTTATA TTGATAAGGA TGCTGGCGCT 1500 

TTTTTGAAAG GAGCGTATAG AATGGATATA TTTTATAAAA AAATAAAAGC AAATGTAACG 1560 

CCCGAAGTTT TAGCACAACT TCATTCCAAG AAGaTCATTT TGGAAAGTAC AAATCAACAA 1620 

CAAACTAAAG GTCGCTATTC AGTTGTTATT TTTGATATTT ATGGCACTTT AACTTTAGAT 1680 

AATGATGTAT TATCAGTAAG TACTTTAAAA GAATCGTATC AAATCACTGA AAGACCGTAC 1740 

CATTATTTAA CGACTAAnAT AAATGAAGAC TACCATAATA TTCCAAGATG AGGCAACTTA 1800 

AGTCATTA 1808 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1320 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

TGGTCGTCAA TTTCTTGATT ATATCTATAA TCCTCATTTT CAATATTAGA GTCTGTAGAA 60

TCATCGATAT TATTATCATT CGCATGACTA GAAGCAGAAT CATTATTTTT ATCATTGCTT 120 

TCTTCTTTTT TGAAGTCTTT ATTTATCAAG TAAATTTCTT CATCAAAATC AGCTTGTTGA 180

GATGTATCAT CTTTATTTTG ATTAGAAAAA TGTGTAGCCT TTGATCTTTT TCTTTGCCGT 240 

CTTTTCTTAG ATGTATTCCT CGTAAATAAT TCTAATTCAT CTTTATCTTC ATTTGATTCT 300

TGTTGATCGT TCTTCGTTTT ATCATCCATC AATACTCACA CCCTTTAATA AGATGGTAAA 360

TGGGCACGGA ATCTTTCAAT AAATTTCTCT CCACGCTCTT CAAAAGTACT ATATTGATCC 420 

CAACTCGCAC AAGCAGGTGA CAATAATACA ACATCATTTG GTTCTATAAT ATCTTGTACT 480

TTATCAACAG CGTCTTCGAC ATTGTTCGCT TCAATGACCG ATTTCCCTTG ACTAITACCT 540 

AGTTTAGCAA ACTTAGCTTT CGTTTGTCCG AATACAACCA TCGCGCGAAC ATTTTCCATA 600 

TAAGGAATGA GTTCGTCAAA TTCATTCCCT OGATCCAAAC CACCACATAA CCAAATGATT 660 

GGTTGATTAA ATGAATTTAA GGCAAACTGT GTTGCTAGCG TGTTTGTTGC TTTGGAATCA 720 

TTATAATATT TATTAGTTCT ATTAGTACCA ACATATTGCA ATCTATGCTC TATTCCTGAA 780 

AATGTAGTTA AACTATCAAT AATTGCtTTA ATAGGTACAC CAGCanAATA CAAGCAAGCA 840 

CAGCTGCTAA TATATTTcTA AATTATGTTC ACCAGGCAAT ACTAGAtCTT CAGTGTTAAT 900 

AATaCSAACA CCTTTATeAA CGATAAAACC ATCTTtAATA TAAaTACCAT CArCTtCTTG 960 

TTGAGTTGAG AAATACAATG TCTTAGCTTT TAATTCTTCC GACTCTATCA CTTGTCTTTG 1020

ATGATAATTA CAAATCAAAT AATCCTCTTC CGTTTGATTT TTATATATTT GCTTTTTAGC 1080 

ATTTTGATAG TTTTCTAAAT TTTCATGGTA ATCTAGATGC GCCGAATAAA TGTTAGTAAT 1140 

TATAGCAATG TGTGGTTTAT ACTTTTCGAT TCCAAGTAAC TGGAATGACG ACAACTCTGT 1200 

AACTAAATAA TCTGTAGGCT TTACTTCTTG TGCTACTTTA GATGCAACAT AACCAATATT 1260 

GCCGGATAAT CTTCCAGTTA AGCGACTTTT TTTAAACATA TCTCCAATTA GAGAAGTAAC 1320 
(2) INFORMATION FOR SEQ ID NO: 81: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4280 base pairs 
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(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

TTTACACCAA TCAAAAAATC GAACTGATAT AAATAAGTAC AAAGCTTATC TATCAATCCG 60 

ATTTAGTTAT AAAACAAAAA AAGCCACAGT AATGTGGCTT TTTGTTATAT TCAGTATCAA 120

AATGGTATCA ATAGCCATTT TCGGAAGTCA AGAATGGCTT AACAACGCOG TTTAAAGCTA 180 

TCCAATACTA CCTTCCATTT CGAACTTGAT TAAAOGGTTC ATTTCGACCG CGTATTCCAT 240 

TQGAAGTTCT TTTGTAAATG GTTCGATGAA TCCCATAACA ATCATTTCTG TCGCTTCTTC 300

TTCAGAAATA CCACGACTCA TTAGATAGAA TAATTGTTCT TCAGAAACTT TTGAAACCTT 360 

GGCTTCATGT TCTAATGATA TTTGATCGTT GAATACTTCG TTATATGGAA TTGTATCTGA 420 


TGTTGATTCG TTATCTAAGA TTAATGTATC ACATTCAATA TTTGAACGAG CACCTTTTGC 480 

TTTACGTCCA AAATGAACAA TACCGCGATA AATAACTTTA CCACCATTTT TAGAAATAGA 540 

TTTAGAAACA ATTGTAGAAG ATGTATTAGG TGCTTTATGA ATCATTTTAG CACCGGCATC 600 


TTGAACTTGT CCTTTACCAG CAAATGCAAT AGATAATGTA CTACCTTTTG CACCTTCACC 660 

TAAAAGAACA CAGTTTGGAT ATTTCATCGT TAACTTAGAA CCTAAGTTAC CATCTACCCA 720 

TTCCATATTT CCGTTTTCAT AAACAAAAGT ACGTTTTGTA ACTAAATTGT ATACATTGTT 780

CGCCCAGTTT TGAATCGTAG TATAACGAAC GTGCGCATCT TTATGCACAA TGATTTCCAC 840 

AACAGCAGAG TGTAAAGAAC TAGTTGTATA AACTGGTGCA GTACAACCTT CTACGTAATG 900 

TACAGAAGCA CCTTCATCAG CAATGATTAA TGTACGTTCA AATTGACCCA TGTTCTCAGA 960

GTTAATACGG AAATAAGCTT GTAGTGGCGT ATCTAGTTTG ATATTTTTAG GTACATAAAT 1020 

GAAOTAACCA CCTGACCATA CTGCTGAGTT TAACGCCGCA AATTTGTTAT CTGCTGCAGG 1080 

TACTACAGAA GCAAAGTATT TTTTGAATAA TTCTTCATTT TCTTGTAAAG CACTATCTGT 1140 

ATCTTTAAAG ATAATACCTT TTTCTTCAAG TTCTTTTTCC ATATTATGGT AAACAACTTC 1200 

AGATTCATAT TGAGCAGAAA CACCAGCTAA ATATTTTTGT TCAGCTTCAG GAATTCCTAA 1260


TTTATCGAAA GTTCTTTTAA TTTCTTCTGG CACTTCATCC CATGAACGTT CAGCTTGTTC 1320 

TGAAGGCTTT ACATAGTAAG TAATGTCATC GAAATTCAAT TCTGATAAGT CGCCACCCCA 1380

TTGAGGCATT GGCATTTTAT AAAACAATTT TAATGATTTA AGACGGAAAT CTAACATCCA 1440 


TTCCGGCTCA TTTTTCATGT TAGAAATTTC TCTAACGATA TTCTCAGTIA AACCACGTTC 1500 

TGATCTGAAA ATGGACACAT CATCGTCGTG GAATCCATAT TTATAATCCC CAACATCAGG 1560 
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TTTAATTCAT GATGTAAACC ATATTATAAC AATGACATGA CATCTTATAA AAATTTTTAT 1680 

ACTTTTATAT GTCTAATATC AAAATTATCT ATGATTAACA GCATTCTATT CTTCTTCAGT 1740 

CGTACCTTCT GCTTTACCTT CTTTAGCAAC AGTACCTTTT TCCAATGCTT TCCAAGCTAA 1800 

TGTGGCACAT TTAATACGAG CTGGGAATTG AGATACACCT TGCAATGCTT CAATATCTCC 1860 

CATTTCTTCT GTAATCACAT AGTCTTCACC AAGCATCATT TTCGTAAATT CTTGGCTCAT 1920 

TTGCATTGCT TCTCCAAGTG AATGACCTTT AACAGCTTGT GTCATCATCG ATGCACTTGC 1980 

CATTGAAATC GAACAACCTT CACCTTCAAA CTTAGCATCT TTTATAATGC CGTCTTCTAT 2040 

ATCAAATGTT AGTCGTATAC GGTCACCGCA TGTCGOGTTA TTCATATCTA CTGTCATAGA 2100

CCCGTTATCT AATACACCTT TATTTCTAGG ATTTTTATAA TGATCCATAA TGACAGATCT 2160 

ATATAATTGA TCTAGATTAT TAAAATTCAT AAGAGAAAAA CTCCTTCGTT TGTTTCAAGG 2220 

CATTTATTAA CTGATCAACG TCTTCTTTCG TGTTGTATAT ATAAAAACTC GCTCTAGCTG 2280

TTGAAGACAC ATTTAACCAT TTCATTAACG GTTGCGCACA ATGATGCCCA GCTCTAACCX5 2340 

CTACACCTTC TGTATCTACG GCTGTAGCAA CATCX5TGTGG ATGTACATCT TGTAAATTAA 2400 

ACGTTATTAC ACCTGCACGA CGATCCTTTG GCGGGCCATA AATTTCAATT CCTTCAATTG 2460

CAGACATTTG CTCATAAGCA TATATCGTTA AITCTTGTTC ATATTTATGA ATTGCATCAA 2520 

AACCTATGCG TTCTAAATAG CGAATAGCTT CTGCAAGCCC AATTGCTTGA GCAATTAATG 2580 

GAGTACCCGC CTCAAATTTA GTAGGTAAAT CAGCCCATGT TGCATCATAC TTACTTACAA 2640 

AATCAATCAT GTCGCCACCG AACTCAATCG GTTCCATTTT TTGTAGTAAC TCACGTTTAC 2700 

CAAATAATAC GCCAATACCT GTTGGTCCAA GCATTTTATG ACCACTAAAA CTATAAAAAT 2760

CAGCATTCAT TTCTTGCATA TCAAGTTTCA TATGTGGTGC TGctTGCGCC CCATCAACAC 2820 

TGATfiATTGC ACCATGTTGA TGAGCTATTT CTGCAATGGT TTTAACATCA TTAATTGTAC 2880 

CGAGCACATT AQATATATGT GCAATAGCAA CGATCTTTGT TTTATCATTA ATCGTTTGCT 2940 

TAATATCCTC GATGTTTAAT TCACCGTCAG CTGTCATTGG TATAAATTTC AATGTCGCAT 3000 

TTTTACGCTT TGCTAACTGT TGCCAAGGAA CAATATTGGC ATGATGTTCC ATTTCAGTGA 3060 

CAACAATTTC ATCGCCCTCT TCAACATTTG CATCACCATA GCTATGTGCT ACAAGGTTAA 3120 

TCGACGCAGT TGTTCCX3CGT GTAAAAATQA TTTCTTCAAA ATACTTCGCA TTAATAAAAC 3180 

GACGAACGGT TTCACGGGCA TTTTCATAAC CATCAGTTGC CAATGATCCT AATGTATGAA 3240 

CACCACGATG AACGTTTGAA TTATAACGCT TGTAGTAATC TTCTAAAACA TTTAACACTT 3300

GCACAGGCGT TTGACTTGTC GCTGTTGAAT CAAGATATGC TAAACGTTTG CCATTQACTT 3360 
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CTTCATTCAC GACCTTTCTT AAATAAAAAT CCTAATCATT TAAATACTGA CGTTGTATTA 3480

GTCTTATACC AATATCGACA GTCTATATCT ATTACAAACT TTTATTTTCA AAATATTATT 3540

TAGAAACTTT GCGTTCAATT ACTTCTCTCA ATTGACGTTT AACGTCTTCG ATAGGTAATT 3600

CACGTACTAC TGGATCTAAG AAACCATGTA TAACAAGACG TTCCX3CTTCT CTTTGAGAAA 3660

TACCACGACT CATTAAATAG TAAAGTTGAT CTGGATCAAC ACGACCTACT GATGCAGCAT 3720

GACCAGCTTG TACATCATCT TCATCAATTA ATAAAATAGG ATTOGCGTCA CCACGAGCAT 3780

GTTCAGATAA CATTAATACA CGTGATTCCT GATTAGCAAT TGATTTAGTT CCACCATGCT 3840

TAATGTAGCC GATACCATTA AATACAGACX3 ATGCATGTTC TTTCATAACA CCATGTTTAA 3900

GGATATAACC ATCTGTTTCT TTACCATATT GTACGATTTT AGATGTTAGA TTAATTTTTT 3960

GTTCGCCTGT ACCTACAACT ACTGATTTAA GTGAACTTGT TGAACGATCA CCAAATAAAT 4020

TTGTTGTATT ATCAATAATT TGGCTACCCT CATTCATTAA ACCTAGTGCC CAATTAATTG 4080

AGGCATCCGC TTCAGTAATA CCACGTCGAA TGATATGACC TGTAAAGCCT TTATCCATAT 4140

AGTCCACTGA GCCATATGTG ATATTTGAAT TTGCACCAGC AATCACTTCA GAAATAATAT 4200

TtAATTGATT TCCTTCACCA GATGCATTTG ItlTAAGTAATT TTCAACATAT GTGACTTCGG 4260

CGCTTTCTTC AGTAACGATG 4280



(2) INFORMATION FOR SEQ ID NO: 82: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15598 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

TCnGACTCGA ACGGTGmAAC TAttCCGTTG TaATTCCgGA GgAAsCAAGG TATGCCCATC 60 

TGCaAAGAAA gaATGsAATG AACTTTTTGG AAATGTAGAA GTGGTAAATA AAGATAAAGG 120 

ATATTACATT CTGAGAAGTA TAAAAGCTTG AAATGAAATG GATATTCTGT TATAGTTATA 180 

TAATGTAAAA ATTTATGTTC AATAAGTGTG TACTTTTACG TTAAATAGAT AAGTTAATTA 240

AGAATAAATA TAGAATCGAA AATGGTGTCA TCATTAGTGT TGCCGTTTTC TTTTTGTCTT 300 

TTTATTAATA TGCTTATGGT ATTTAGCTAA AAGCGGATCA CATAATTTTT GAGGGGTGAA 360 

TCTGTTTGGC AGGTCAAGTT GTCCAATATG GAAGACATCG TAAACGTAGA AACTACGCGA 420

GAATTTCAGA AGTATTAGAA TTACCAAACT TAATAGAAAT TCAAACTAAA TCTTACGAGT 480 
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CTGGTAATTT GTCATTAGAG TTTGTGGATT ACCGTTTAGG AGAACCAAAA TATGATTTAG 600 

AAGAATCTAA AAACCGTGAC GCTACTTATG CTGCACCTCT TCGTGTAAAA GTGCGTCTAA 660 

TCATTAAAGA AACAGGAGAA GTTAAAGAAC AAGAAGTCTT TATGGGTGAT TTCCCATTAA 720 

TGACTGATAC AGGTACGTTC GTTATCAATG GTGCAGAACG TGTAATCGTA TCTCAATTAG 780 

TTCGTTCACC ATCCGTTTAT TTCAATGAAA AAATCX5ACAA AAATGGTCGT GAAAACTATG 840 

ATGCAACAAT TATTCCAAAC CGTGGTGCAT GGTTAGAATA TGAAACAGAT GCTAAAGATG 900 

TTGTATACGT ACGTATTGAT AGAACACGTA AACTACCATT AACAGTATTG TTACGTGCAT 960 

TAGGTTTCTC AAGCGACCAA GAAATTGTTG ACCTTTTAGG TGACAATGAA TATTTACGTA 1020 

ATACTTTAGA GAAAGACGGC ACTGAAAACA CTGAACAAGC GTTATTAGAA ATCTATGAAC 1080 

GTTTACGTCC AGGTGAACCA CCAACTGTTG AAAATGCTAA AAGTCTATTG TATTCACGTT 1140 

TCTTTGATCC AAAACGCTAT GACTTAGCAA GCGTGGGTCG TTATAAAACA AACAAAAAAT 1200

TACATTTAAA ACATCGTTTA TTTAATCAAA AATTAGCTGA GCCAATTGTA AATACTGAAA 1260 

CTGGTGAAAT TGTAGTTGAA GAAGGTACAG TGCTTGATCG TCGTAAAATC GACGAAATCA 1320 

TGGATGTACT TGAATCAAAT GCAAACAGCG AAGTGTTTGA ATTGCATGGT AGCGTTATAG 1380

AOGAGCCAGT AGAAATTCAA TCAATTAAAG TATATGTTCC TAACGATGAT GAAGGTCGTA 1440 

CGACAACTGT AATTGGTAAT GCTTTCCCTG ACTCAGAAGT TAAATGCATT ACACCAGCAG 1500 

ATATCATTGC TTCAATGAGT TACTTCTTTA ACTTATTAAG CGGTATTGGA TATACAGATG 1560 

ATATTGACCA TTTAGGTAAC CGTCGTTTAC GTTCTGTAGG TGAATTACTA CAAAACCAAT 1620 

TCCGTATCGG TTTATCAAGA ATGGAAAGAG TTGTACGTGA AAGAATGTCA ATTCAAGATA 1680 

CTGAGTCTAT CACACCTCAA CAATTAATTA ATATTCGACC TGTTATTGCA TCTATTAAAG 1740 

AAnCrrrGG tagctctcaa ttatcacaat tcatggacca agcaaaccca ttagctgagt 1800

TAACGCATAA ACGTCGTCTA TCAGCATTAG GACCTGGTGG TTTAACACX5T GAACX3TGCTC 1860 

AAATGGAAGT ACGTGACGTT CACTACTCTC ACTATGGCCG TATGTGTCCA ATTGAAACAC 1920 

CTGAGGGACC AAACATTGGA TTGATTAACT CATTATCAAG TTATGCAOGT GTAAATGAAT 1980

TCGGCTTTAT TGAAACACCA TATCGTAAAG TTGATTTAGA TACACATGCT ATCACTGATC 2040

AAATTGACTA TTTAACAGCT GACGAAGAAG ATAGCTATGT TGTAGCACAA GCAAACTCTA 2100 

AATTAGATGA AAATGGTCGT TTCATGGATG ATGAAGTTGT ATGTCGTTTC CGTGGTAACA 2160 

ATACAGTTAT GGCTAAAGAA AAAATGGATT ATATGGATGT ATCGCCGAAG CAAGTTGTTT 2220

CAGCAGCGAC AgcATGTATT CCATTCTTAG AAAATGATGA CTCAAACCGT GCATTGATGG 2280 
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CAGGTATGGA ACACGTTGCA GCACGTGATT CTGGTGCGGC TATTACAGCT AAGCACAGAG 2400 

GTCGTGTTGA ACATGTTGAA TCTAATGAAA TTCTTGTTCG TCGTCTAGTT GAAGAGAACG 2460 


GCGTTGAGCA TGAAGGTGAA TTAGATCGCT ATCCATTAGC TAAATTTAAA CGTTCAAACT 2520 

CAGGTACATG TTACAACCAA CGTCCAATCG TTGCAGTTGG AGATGTTGTT GAGTATAACG 2580 

AGATTTTAGC AGATGGACCA TCTATGGAAT TAGGAGAAAT GGCATTAGGT AGAAACGTAG 2640 


TAGTTGGTTT CATGACTTGG GACGGTTACA ACTATGAGGA TGCCGTTATC ATGAGTGAAA 2700 

GACTTGTGAA AGATGACGTG TATACTTCTA TTCATATTGA AGAGTATGAA TCAGAAGCAC 2760

GTGATACTAA GTTAGGACCT GAAGAAATCA CAAGAGATAT TCCTAATGTT TCTGAAAGTG 2820

CACTTAAGAA CTTAGACGAT CGTGGTATCG TTTATATTGG TGCAGAAGTA AAAGATGGAG 2880 

ATATTTTAGT TGGTAAAGTA ACGCCTAAAG GTGTAACTGA GTTAACTGCC GAAGAAAGAT 2940 

TGTTACATGC AATCTTTGGT GAAAAAGCAC GTGAAGTTAG AGATACTTCA TTACGTGTAC 3000

CTCACGGCGC TGGCGGTATC GTTCTTGATG TAAAAGTATT CAATCGTGAA GAAGGCGACG 3060 

ATACATTATC ACCTGGTGTA AACCAATTAG TACGTGTATA TATCGTTCAA AAACGTAAAA 3120

TTCATGTTGG TGATAAGATG TGTGGTCGAC ATGGTAACAA AGGTGTCATT TCTAAGATTG 3180 

TTCCTGAAGA AGATATGCCT TACTTACCAG ATGGACGTCC GATCGATATC ATGTTAAATC 3240 

CTCTTGGTGT ACCATCTCGT ATGAACATCG GACAAGTATT AGAGCTACAC TTAGGTATGG 3300 


CTGCTAAAAA TCTTGGTATT CACGTTGCAT CACCAGTATT TGACGGTGCA AACGATGACX5 3360 

ATGTATGGTC AACAATTGAA GAAGCTGGTA TGGCTCGTGA TGGTAAAACT GTACTTTATG 3420 

ATGGACGTAC AGGTGAACCA TTCGATAACC GTATTTCAGT AGGTGTAATG TACATGTTGA 3480 


AACTTGCGCA CATGGTTGAT GATAAATTAC ATGCGCGTTC AACAGGACCA TATTCACTTG 3540 

tXAC^CAACA ACCACTTGGC GGTAAAGCGC AATTCX3GTGG ACAACGTTTT GGTGAGATGG 3600 

AGGTATGGGC ACTTGAAGCA TATGGTGCTG CATACACATT ACAAGAAATC TTAACTTACA 3660 


AATCCGATGA TACAGTAGGA CGTGTGAAAA CATACGAGGC TATTGTTAAA GGTGAAAACA 3720 

TCTCTAGACC AAGTGTTCCA GAATCATTCC GAGTATTGAT GAAAGAATTA CAAAGTTTAG 3780 

GTTTAGATGT AAAAGTTATG GATGAGCAAG ATAATGAAAT CGAAATGACA GACGTTGATG 3840

ACGATGATGT TGTAGAACGC AAAGTAGATT TACAACAAAA TGATGCTCCT GAAACACAAA 3900 

AAGAAGTTAC TGATTAATAC GCAATTTACA AAACAGGCAA AAAGATACTA AGCTGAATTT 3960 

TATTGATGAT TCAGTTTAGT ACTTTAAGCC ATTTTAAATA AATGCAAATC AATCAAATAG 4020

CACAGCTAAT CTAAATTGAA GGAGGTAGGC TCCTTGATTG ATGTAAATAA TTTCCATTAT 4080 
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AAACCTGTU^ CAATCAACTA CCGTACATTA AAACCTGAAA AAGATGGTCT ATTCTGTGAA 4200 

AGAATTTTCG GACCTACAAA AGACTGGGAA TGTAGTTGTG GTAAATACAA ACGTGTTCGC 4260 

TACAAAGGCA TGGTCTGTGA CAGATGTGGA GTTGAAGTAA CTAAATCTAA AGTACGTCGT 4320 

GAAAGAATGG GTCACATTGA ACTTGCTGCT CCAGTTTCTC ACATTTGGTA TTTCAAAGGT 4380 

ATACCAAGTC GTATGGGATT ATTACTTGAC ATGTCACCAA GAGCATTAGA AGAAGTTATT 4440 

TACTTTGCTT CTTATGTTGT TGTAGATCCA GGTCCAACTG GTTTAGAAAA GAAAACTTTA 4500 

TTATCTGAAG CTGAATTCAG AGATTATTAT GATAAATACC CAGGTCAATT CGTTGCAAAA 4560 

ATGGGTGCAG AAGGTATTAA AGATTTACTT GAAGAGATTG ATCTTGAC3GA AGAACTTAAA 4620 

TTGTTACX3CG ATGAGTTGGA ATCAGCTACT GGTCAAAGAC TTACTCGTGC AATTAAACGT 4680 

TTAGAAGTTG TTGAATCATT CCGTAATTCA GGTAACAAAC CTTCATGGAT GATTTTAGAT 4740 

GTACTTCCAA TCATCCCACC AGAAATTCGT CCAATGGTTC AATTAOATGG TGGACGATTT 4800 

GCAACAAGTG ACTTAAACGA CTTATACCGT CGTGTAATTA ATCGAAATAA TCGTTTGAAA 4860 

CGTTTATTAG ATTTAGGTGC ACCTGGTATC ATCGTTCAAA ACGAAAAACG TATGTTACAA 4920 

GAAGCCGTTG ACGCTTTAAT TGATAATGGT CGTCGTGGTC GTCCAGTTAC TGGCCCAGGT 4980 

AACCGTCCAT TAAAATCTTT ATCTCATATG TTAAAAGGTA AACAAGGTCG TTTCCGTCAA 5040 

AACTTACTTG GTAAACGTGT TGACTATTCA GGACGTTCAG TTATTGCAGT AGGTCCAAGC 5100 

TTGAAAATGT ACCAATGTGG TTTACCAAAA GAAATGGCAC TTGAACTATT TAAACCATTC 5160 

GTAATGAAAG AATTAGTTCA ACGTGAAATT GCAACTAACA TTAAAAATGC GAAGAGTAAA 5220 

ATCGAACGTA TGGATGATGA AGTTTGGGAC GTATTGGAAG AAGTAATTAG AGAACATCCT 5280 

GTATTACTTA ACCGTGCACC AACACTTCAT AGACTTGGTA TTCAAGCATT TGAACCAACT 5340 

TTAGTTGAAG GTCGTGCGAT TCGTCTACAT CCACTT6TAA CAACAGCTTA TAACGCTGAC 5400 

TTTGACGGTG ACCAAATGGC QGTTCACGTT CCTTTATCAA AAGAGGCACA AGCTGAAGCA 5460 

AGAATGTTGA TGTTAGCAGC ACAAAACATC TTGAACCCTA AAGATGGTAA ACCTGTAGTT 5520 

ACACCATCAC AAGATATGGT ACTTGGTAAC TATTACCTTA CTTTAGAAAG AAAAGATGCA 5580 

GTAAATACAG GCGCAATCTT TAATAATACA AATGAAGTAT TAAAAGCATA TGCAAATGGC 5640 

TTTGTACATT TACACACTAG AATTGGTGTA CATGCAAGTT CGTTCAATAA TCCAACATTT 5700 

ACTGAAGAAC AAAACAAAAA GATTCTTGCT ACGTCAGTAG GTAAAATTAT ATTCAATGAA 5760 

ATCATTCCAG ATTCATTTGC TTATATTAAT GAACCTACGC AAGAAAACTT AGAAAGAAAG 5820 

ACACCAAACA GATATTTCAT CGATCCTACA ACTTTAGGTG AAGGTGGATT AAAAGAATAC 5880 






GAAGTATTCA ACAGATTTAG CATCACTGAT ACATCAATGA TGTTAGACCG TATGAAAGAC 6000 

TTAGGATTCA AATTCTCATC TAAAGCTGGT ATTACAGTAG GTGTTGCTGA TATCGTAGTA 6060 

TTACCTGATA AGCAACAAAT ACTTGATGAG CATGAAAAAT TAGTCGACAG AATTACAAAA 6120 

CAATTCAACC GTGGTTTAAT CACTGAAGAA GAAAGATATA ATGCAGTTGT TGAAATTTGG 6180 

ACAGATGCAA AAGATCAAAT TCAAGGTGAA TTGATGCAAT CACTTGATAA AACTAACCCA 6240 

ATCTTCATGA TGAGTGATTC AGGTGCCCGT GGTAACGCAT CTAACTTTAC ACAGTTAGCA 6300 

GGTATGCGTG GATTGATGGC CGCACCATCT GGTAAGATTA TCGAATTACC AATCACATCT 6360 

TCATTCCGTG AAGGTTTAAC AGTACTTGAA TACTTCATCT CAACTCACGG TGCACGTAAA 6420 

GGTCTTGCCG ATACAGCACT TAAAACAGCT GACTCAGGAT ATCTTACTCG TCGTCTTGTT 6480 

GACGTGGCAC AAGATGTTAT TGTTCGTGAA GAAGACTGTG GTACTGATAG AGGTTTATTA 6540 

GTTTCTGATA TTAAAGAAGG TACAGAAATG ATTGAACCAT TTATCGAACX3 TATTGAAGGT 6600 

CGTTATTCTA AAGAAACAAT TCGTCATCCT GAAACTGATG AAATAATCAT TC6TCCTGAT 6660 

GAATTAATTA CACCTGAAAT TGCTAAGAAA ATTACAGATG CTGGTATTGA ACAAATGTAT 6720 

ATTCGCTCAG CATTTACTTG TAACGCACGA CATGGTGTTT GTGAAAAATG TTACGGTAAA 6780 

AACCTTGCTA CTGGTGAAAA AGTTGAAGTT GGTGAAGCAG TTGGTACAAT TGCAGCCCAA 6840 

TCTATCGGTG AACCAGGTAC ACAGCTTACA ATGCGTACAT TCCATACAGG TGGGGTAGCA 6900 

GGTAGCGATA TCACACAAGG TCTTCCTCGT ATTCAAGAGA TTTTCGAAGC ACGTAACCcT 6960 

AAAGGTCAAG CGGTAATTAC GGAAATCGAA GGTGTCGTAG AAGATATTAA ATTAGCAAAA 7020 

GATAGACAAC AAGAAATTGT TGTTAAAGGT GCTAATGAAA CAAGATCATA CCTTGCTTCA 7080 

GGTACTTCAA GAATTATTGT AGAAATCGGT CAACCAGTTC AACGTGGTGA AGTATTAACT 7140 

GAAGGTTCTA TTGAACCTAA GAATTACTTA TCTGTTGCTG GATTAAACGC GACTGAAAGC 7200 

TACTTATTAA AAGAAGTACA AAAAGTTTAC CGTATGCAAG GTGTAGAAAT CGACGATAAA 7260 

CACGTTGAGG TTATGGTTCG ACAAATGTTA CGTAAAGTTA GAATTATCQA AGCAGGTGAT 7320 

ACGAAGTTAT TACCAGGTTC ATTAGTTGAT ATTCATAACT TTACAGATGC AAATAGA6AA 7380 

GCATTTAAAC ACCGTAAGCG TCCT6CAACA GCTAAACCAG TATTACTTGG TATTACTAAA 7440 

GCATCACTTG AAACAGAAAG TTTCTTATCT GCAGCATCAT TCCAAGAAAC AACAAGAGTT 7500 

CTTACAGATG CAGCAATTAA AGGTAAGCGT GATGACTTAT TAOGTCTTAA A6AAAACGTA 7560 

ATTATTGGTA AGTTAATTCC AGCTGGTACT GGTATGAGAC GTTATAGCGA CGTAAAATAC 7620 

GAAAAAACA6 CTAAACCAGT TGCAGAAGTT GAATCTCAAA CTGAAGTAAC GGAATAACAA 7680 






ATGTTGACGA ATTCTCTTGT TCAATGTTAA TATATTAAAG GTTGATGCAA GCAGAACTTT 7800 

GGAGGATAAA TTATTGTCTA AGGAAAAAGT tGCACGCTTT AACAAACAAC ATTTTGTAGT 7860 

TGGTCTTAAA GAAACGCTTA AAGCGTTAAA GAAAGATCAA GTTACATCTT TGATTATTGC 7920 

TGAAGACGTT GAAGTATATT TAATGACTCG CGTGTTAAGC CAAATCAATC AGAAAAATAT 7980 

ACCTGTATCT TTTTTCAAAA GCAAACATGC TTTGGGTAAA CATGTAGGTA TTAACGTCAA 8040 

TGCGACAATA GTAGCATTGA TTAAATGAGA ATTAGTAAGT GTTTTACTTA CTAAATTTTA 8100 

TTTAACCTAA AAATGAACCA CCTGGATGT6 TGGGATTAAA AAGTGAAGAG AGGAGGACAT 8160 

ATCACATGCC AACTATTAAC CAATTAGTAC GTAAACCAAG ACAAAGCAAA ATCAAAAAAT 8220 

CAGATTCTCC AGCTTTAAAT AAAGGTTTCA ACAGTAAAAA GAAAAAATTT ACT6ACTTAA 8280 

ACTCACCACA AAAACGTGGT GTATGTACTC GTGTAG6TAC AATGACACCT AAAAAACCTA 8340 

ACTCAGCGTT ACGTAAATAT GCACGTGTGc gTtTATCAAA CAACATCGAA ATTAACGCAT 8400 

ACATCCCTGG TATCGGACAT AACTTACAAG AACACAGTGT TGTACTTGTA CGTGGTGGAC 8460 

GTGTAAAAGA CTTACCy^GGT GTGCGTTACC ATATTGTACG TGGAGCACTT GATACTTCAG 8520 

GTGTTGACGG ACGTAGACAA GGTCGTTCAT TATACGGAAC TAAGAAACCT AAAAACTAAG 8580 

AATTTAGTTT TTAATTAAAT CTTAAACTTA AAATATTTAA TATAAGGAAG GGAGGATTTA 8640 

CATTATGCCT CGTAAAGGAT CAGTACCTAA AAGAGACGTA ITACCAGATC CAATTCATAA 8700 

CTCTAAGTTA GTAACTAAAT TAATTAACAA AATTATGTTA GATGGTAAAC GTGGAACAGC 8760 

ACAAAGAATT CTTTATTCAG CATTCGACCT AGTTGAACAA CGCAGgtTCG TGATGCATTA 8820 

GAAGTATTCG AAGAAGCAAT CAACAACATT ATGCCAGTAT TAGAAGTTAA AGCTCGTCGC 8880 

GTAGGTGGTT CTAACTATCA AGTACCAGTA GAAGTTCGTC CAGAGCGTCG TACTACTTTA 8940 

GGTTTACGTT GGTTAGTTAA CTATGCACGT CTTCGTGGTG AAAAAACGAT GGAAGATCGT 9000 

TTAGCTAACG AAATTTTAGA TGCAGCAAAT AATACAGGTG GTGCCGTTAA GAAACGTGAG 9060 

GACACTCACA AAATGGCTGA AGCAAACAAA GCATTTGCTC ACTACCGTTG GTAAGATAAA 9120 

AGCTTTTACC CTGAGTGTGT TCTATATTAA TGAATTTTCA TTAAGCGTTC ATGCTTAGGG 9180 

CATCGCCATA TCTATCGTAT TTATTCAGTA ATATAAACTG GAAGGAGAAA AAATACATGG 9240 

CTAGAGAATT TTCATTAGAA AAAACTCGTA ATATCGGTAT CATGGCTCAC ATTGATGCTG 9300 

GTA/y^ACGAC TACGACTGAA CGTATTCTTT ATTACACTGG CCGTATCCAC AArGknGGTG 9360 

AAaCACACGA AGGTGCTTCA CAAATGGACT GGATGGAGCA AGAACAAGAC CGTGGTATTA 9420 

CTATCACATC TGCTGCAACA ACAGCAGCTT GGGAAGGTCA CCGTGTAAAC ATTATCGATA 9480 
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CAGTTACAGT ACTTGATGCA CAATCAGGTG TTGAACCTCA AACTGAAACA GTTTGGCGTC 9600 

AGGCTACAAC TTATGGTGTT CCACGTATCG TATTTGTAAA CAAAATGGAC AAATTAGGTG 9660 

CTAACTTCGA ATACTCTGTA AGTACATTAC ATGATCGTTT ACAAgCTAAC GCTGCTCCAA 9720 

TCCAATTACC AATTGGTGCG GAAGACGAAT TCXjAAGCAAT CATTGACTTA GTTGAAATGA 9780 

AATGTTTCAA ATATACAAAT GATTTAGGTA CTGAAATTGA AGAAATTGAA ATTCCTGAAG 9840 

ACCACTTAGA TAGAGCTGAA GAAGCTCGTG CTAGCTTAAT CGAAGCAGTT GCAGAAACTA 9900 

GCGACGAATT AATGGAAAAA TATCTTGGTG ACGAAGAAAT TTCAGTTTCT GAATTAAAA6 9960 

AAGCTATCCO CCAAGCTaCt AcTAACGTAG AATTCTACCC AGTACTTTGT GGTACAGCTT 10020 

TCAAAAACAA AGGTGTTCAA TTAATGCTTG ACX3CTGTAAT TGATTACTTA CCTTCACCAC 10080 

TAGACGTTAA ACCAATTATT GGTCACCGTG CTAGCAACCC T6AAGAAGAA GTAATCGCGA 10140 

AAGCAGACGA TTCAGCTGAA TTCGCTGCAT TAGCGTTCAA AGTTATGACT GACCCTTATG 10200 

TTGGTAAATT AACATTCTTC CGTGTGTATT CAGGTACAAT GACATCTGGT TCATACGTTA 10260 

AGAACTCTAC TAAAGGTAAA CGTGAACGTG TAGGTCGTTT ATTACAAATG CACGCTAACT 10320 

CACGTCAAGA AATCGATACT GTATACTCTG GA6ATATCGC TGCTGCGGTA GGTCTTAAAG 10380 

ATACAGGTAC TGGTGATACT TTATGTGGTG AGAAAAATGA CATTATCTTG GAATCAATGG 10440 

AATTCCCAGA GCCAGTTATT CACTTATCAG TAGAGCCAAA ATCTAAAGCT GACCAAGATA 10500 

AAATGACTCA AGCTTTAGTT AAATTACAAG AAGAAGACCC AACATTCCAT GCACACACTG 10560 

ACGAAGAAAC TGGACAAGTT ATCATCGGTG GTATGGGTGA GCTTCACTTA GACATCTTAG 10620 

TAGACCGTAT GAAGAAAGAA TTCAACGTTG AATGTAACGT AGGTGCTCCA ATGGTTTCAT 10680 

ATCGTGAAAC ATTCAAATCA TCTGCACAAG TTCAAGGTAA ATTCTCTCGT CAATCTGGTG 10740 

GTCGTGGTCA ATACGGTGAT GTTCACATTG AATTCACACC AAACGAAACA GGCGCAGGTT 10800 

TCGAATTCGA AAACGCTATC GTTGGTGGTG TAGTTCCTCG TGAATACATT CCATCAGTAG 10860 

AAGCTGGTCT TAAAGATGCT ATGGAAAATG GTGTTTTAGC AGGTTATCCT TTAATTGATG 10920 

TTAAAGCTAA ATTATATGAT GGTTCATACC ATGATGTCGA TTCATCTGAA ATGGCCTTCA 10980 

AAATTGCTGC ATCATTAGCA CTTAAAGAAG CTGCTAAAAA ATGTGATCCT GTAATCTTAG 11040 

AACCAATGAT GAAAGTAACT ATTGAAATGC CTGAAGAGTA CATGGGTGAT ATCATGGGTG 11100 

ACGTAACATC TCGTCGTGGA CGTGTTGATG GTATGGAACC TCX5TGGTAAT GCACAAGTTG 11160 

TTAATGCTTA TGTACCACTT TCAGAAATGT TCX3GTTATGC AACATCATTA CGTTCAAACA 11220 

CTCAAGGTCG CGGTACTTAC ACTATGTACT TCXSATCACtA TGCTGAAGTT CCaAAATCaA 11280 
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GCCTAGGTTA AAATACAAGG TGAGCTTAAA TGTAAGCTAT CATCTTTATA GTTTGATTTT 
TTGGGGTGAA TGCATTATAA AAGAATTGTA AAATTCTTTT TGCATCGCTA TAAATAATTT 
CTCATGATGG TGAGAAACTA TCATGAGAGA TAAATTTAAA TATTATTTTT AATTAGAATA 
GGAGAGATTT TATAATGGCA AAAGAAAAAT TCGATCGTTC TAAAGAACAT GCCAATATCG 
GTACTATCGG TCACGTTGAC CATGGTAAAA CAACATTAAC AGCAGCAATC GCTACTGTAT 
TAGCAAAAAA TGGTGACTCA GTTGCACAAT CATATGACAT GATTGACAAC GCTCCAGAAG 
AAAAAGAACG TGGTATCACA ATCAATACTT CTCACATTGA GTACCAAACT GACAAACGTC 
ACTAOOCTCA CGTTGACTGC CCAGGACACG CTGACTACX5T TAAAAACATG ATCACTGGTG 
CTGCTCAAAT GGACGGCGGT ATCTTAGTAG TATCTGCTGC TGACGGTCCA ATGCCACAAA 
CTCGTGAACA CATTCTTTTA TCACGTAACG TTGGTGTACC AGCATTAGTA GTATTCTTAA 
ACAAAGTTGA CATGGTTGAC GATGAAGAAT TATTAGAATT A6TAGAAATG GAAGTTCGTG 
ACTTATTAAG CGAATATGAC TTCCCAGGTG ACGATGTACC TGTAATCGCT GGTTCAGCAT 
TAAAAGCTTT AGAAGGCGAT GCTCAATACG AAGAAAAAAT CTTAGAATTA ATGGAAGCTG 
TAGATACTTA CATTCCAACT CCAGAACGTG ATTCTGACAA ACCATTCATG ATGCCAGTTG 
AGGACGTATT CTCAATCACT GGTCGTGGTA CTGTTGCTAC AGGCCGTCTT GAACGTGGTC 
AAATCAAAGT TGGTGAAGAA GTTGAAATCA TCGGTTTACA TGACACATCT AAAACAACTG 
TTACAGGTGT TGAAATGTTC CGTAAATTAT TAGACTACGC TGAAGCTGGT GACAACATTG 
GTGCATTATT ACGTGGTGTT GCTCGTGAAG ACGTACAACG TGGTCAAGTA TTAGCTGCTC 
CTGGTTCAAT TACACCACAT ACTGAATTCA AAGCAGAAGT ATACGTATTA TCAAAAGACG 
AAGGTGGACG TCACACTCCA TTCTTCTCAA ACTATCGTCC ACAATTCTAT TTCCGTACTA 
CTGAGGTAAC TGGTGTTGTT CACTTACCAG AAGGTACTGA AATGGTAATG CCTGGTGATA 
AC6TT6AAAT GACAGTAGAA TTAATCGCTC CAAT0GCX5AT TGAAGACGGT ACTCGTTTCT 
CAATCCGTGA AGGTGGACGT ACTGTAGGAT CAGGCGTTGT TACTGAAATC ATTAAATAAT 
TTCTAATTTC TTAGATTTTA TATAAAAAGA AGATCCCTCA ATCGAGGGGt CTTTTTTTAA 
TGTGTAAATT TTGTAATGGC TATTCGATTT AGAAGAACAA TAATTGATGA AAGACTGACT 
AATAAAACTT ATAACTGATA ATACTGTTTA AATAAAATTG TTGAGTCTTG GACATTGTAA 
AATGCTCCCT TCAAAGTTTT CATTTTTTCa ATGTCTACTT TGAAGGGAGC ATTTCATTAG 
TTTATGTCTC AGATTCATAT CTTTCAATTA ATTTAAATGC TTAATTTGTT TTAAATACTT 
GCTCTAATTC TATGATTTTT AAAAATACAG CTACAGCGTA TTTTAATGAT TTTTCATCAA 
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12360 

12420 

12480 

12540 

12600 

12660 

12720 

12780 

12840 

12900 

12960 

13020 

13080 
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TCAGAAAGAA TGCACCTGGT CGTACTTTCA AATAATGTGA AAAATCTTCT CCAATCATCA 13200 

TTAAATCTGA TTCATTAAAG CGTACATGTA AGTCATTTGT TGCTTCTTTA ATAACTTGAT 13260 

ATGCTTTCTC GTTATTATGG ACAGGCAAAT ACCCTTTAAT ATAATTCAAA TCATAGTTAA 13320 

TATCATTTGC TATTGCTAAA CCTTGTAGAA GCTTATCCAT TTTGTCCATT ACATGATTCT 13380 

GTATATCTGA ATCX5AAAGTT CTAACTGTAC CTTTACAAAA TGCTTGATCA GGAATAACGC 13440 

TATCTGTGGT GCCTGCTTGA ATCATTCCAA ATGAAAGTAC AGCTTGTTTA ACTGGATCGA 13500 

TCGTACGTGA AATTATTTTT TGTGCACTTA AAATGAACTC TGCCATGATT ACTATTGGGT 13560 

CAATGGTTTC ATGAGGTTTG GCACCATGAC CACCACGACC TTTAAATGTG ACGCTAAATT 13620 

CATCTGQA6A GGCCATGATT GCCCXTCGCAC GTGAATGAAT AGTTCCAGTA GGATAACCAC 13680 

TCCATAAATG TGTACCGTAA ATTCTATCTA CATTTTCCAG ACATCCAGCA TCTATCATTT 13740 

CTTGAGAACC ACCTGGCATG ATTTCTTCAC CGTACTGGAA TATTAATACA ACATTACCTT 13800 

CTAATAAATG TTTATGTTCA TCTAAAATCT CTGCTACAGT AAGTAAAATT GCTGTATGAC 13860 

CATCATGCCC ACACGCATGC ATACATCCTG GATTTTTAGA CTTATAAGGC ACATCGTTTA 13920 

ATTCCTCGAC AGGTAACGCA TCAAAGTCAG CTCTTAATGC AATGGTAGGT CCTGTGCCCA 13980 

AGCCTTTAAA TGTGGCTTTG ATACCATTGC GGCCGATAGG AGTTTCAATA TCACAAGATA 14040 

ACTGGCTTAA TTGGTTAACA ATATAATCAT GTGTTTGAAA TTCTTCAAAA GATAACTCAG 14100 

GATATTGGTG TAAATAACGT CTGAGTTGAA TTGTTTTATT TTCTTTATTA TTTGCTAGTT 14160 

GGAACCAATC TAACACCCTT ATCACTACTT TCTAAAATAA TGTTTATAGT ATAACATTTT 14220 

ATGAAATTAT CGTACTAAAT GATTGCTTTG AGATATTTTA TCTATGAATG ATAAGGCTTT 14280 

CAAGTTATGT AGAATTACTG TATGATAAAG GTATTACCAA ACAATACTTA AGGGGGATTA 14340 

TATACTGTGG TTCAATCATT ACATGAGTTT TTAGAGGAAA ATATAAATTA TCTAAAAGAA 14400 

AATGGTTTGT ATAATGAAAT AGATACAATT GAAGGTGCAA ACGGACCAGA AATCAAAATC 14460 

AATGGGAAAT CATACATTAA CTTATCTTCA AATAATTATT TAGGACTAGC AACAAATGAA 14520 

GATTTGAAAT CaGctGCAAA AGCAGCTATT GATACACATG OTGTAGGTGC AGGCGCTGTT 14580 

CGTACAATCA ATGGTACATT AGATTTACAC GACGAATTAG AAGAAACACT AGCAAAATTT 14640 

AAAGGAACAG AAGCTGCAAT AGCTTATCAA TCAGGATTTA ATTGTAATAT GGCTGCTATT 14700 

TCAGCTGTCA TGAATAAAAA TGATGCTATT TTATCAGATG AGCTTAATCA TGCATCAATT 14760 

ATTGATGGAT GTCGCTTATC TAAAGCTAAA ATTATTCGAG TTAACCATTC AGACATGGAT 14820 

GATTTACGTG CGAAAGCAAA AGAAGCAGTT GAATCAGGTC AATACAATAA AGTGATGTAT 14880 
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IS 



ATTGCAGAAG AATTTGGTTT ATTAACTTAT GTTGACGACG CTCATGGTTC AGGTGTTATG 15000 

GGTAAAGGCG CTGGTACGGT TAAACATTTT GGTTTACAAG ATAAAATCGA TTTCCAAATA 15060 

GGTACGCTTT CTAAAGCAAT TGGTGTCGTT GGCGGTTATG TAGCAGGTAC AAAAGAGTTA 15120 

ATAGATTGGT TAAAAGCACA ATCACGACCA TTCTTATTCT CTACATCATT AGCACCTGGG 15180 

GATACCAAAG CAATAACTGA AGCAGTTAAA AAGTTAATGG ATTCAACTGA ATTACATGAT 15240 

AAATTATGGA ACAATGCACA ATATTTAAAA AATGGATTGT CAAAATTAGG ATATGATACA 15300 

GGTGAGTCAG AAACTCCAAT TACACCAGTA ATTATTGGTG AT6AAAAAAC AACTCAAGAA 15360 

TTTAGTAAGC GTTTAAAAGA CQAAGGTGTC TATGTGAAAT CTATCGTTTT CCCAACAGTA 15420 

CCAAGAGGTA CAGGAOGTGT AAGAAATATG CCTACAGCTG CACATACAAA AGACATGTTA 15480 

GATGAAGCAA TTGCGGCTTA TGAAAAAGTA GGAAAAGAAA TGAAGTTGAT TTAATATTTA 15540 

TTTATTCCCA CGGCAAATAT TGTCGTGGGC TTTTTTTAAT GTTTAGTTTA TTAACAGT 15598 

(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 661 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 

AAGTAAATCA ACTTACTGGG ATAAGAATAA AGGCGATTAT AGTAACAAGT TGATTTTATT 60 

CGAAAAACAT TTTGAACCGG TTCTGGGTAT CAAGATGCAA CATAGTGGAG GTCATAGCTT 120 

TGGCCACACG ATTATTACGA TTGAAAGTCA AGGAGATAAA GCAGTTCATA TGGGTGATAT 180 

ATTCCCAACT ACTGCACATA AAAATCCTCT ATGGGTAACG GCATATGATG ATTATCCTAT 240 

GCAATCGATT CGTGAAAAAG AACGCATGAT ACCATATTTT ATTCAGCAAC AATATTGGTT 300 

CTTGTTTTAT CATGATGAAA ACTACTTTGC TGTAAAATAC A6CQATAATG GTGAAAACAT 360 

AGATGCATAT ATTTTACGTG AAACATTAGT TGATAATAAC TAAAATAAAG ATGTATTACT 420 

AAACAAATTT TCAAAAATAA AAAATTGAGC CACATCCAAT CTTACTAATT AGGGTGTGGC 480 

TCATTTTTAA GTTTTACgAT CCAAATCAAA TATGGaTAAA ATTCgTATTA ACGCTCTACa 540 

ATGtTAATGA CTTCACCAGT ATATGCATCT GCATAAAAAT CATAATGAAT ATTTTGACCA 600 

TTTTTAATAG TTGTAATTCC ACCTTGATAA ACTAAACGGT ATTTATCAGT TTCAGGATGA 660 

A 661 



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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5738 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 

GCAGACGGTA CAGCAGTTAA AGTCGCACCA AaACTGTAGT GAATcTAATC GGTGcATTCT 60 

TTTTAGGATT AGTTGTCGCG CTTATATATA TCTTCTTCAA AGTAATTTTC GATAAGCGAA 120 

TTAAAGATGA AGAAGATGTA GAGAAAGAAT TAGGATTGCC TGTATTGGGT TCAATTCAAA 180 

AATTTAATTA AGGATGGTTG CTACTTATGT CAAAAAAGGA AAATACGACA ACAACACTAT 240 

TTGTATATGA AAAACCAAAA TCAACAATTA GTGAAAAGTT TCGAGGTATA CGTTCAAACA 300 

TCATGTTTTC AAAAGCAAAT GGTGAAGTAA AGCGCTTATT GGTTACTTCT GAAAAGCCTG 360 

GTGCAGGTAA AAGTACAGTT GTATCGAATG TAGCGATTAC TTATGCACAA GCAGGCTATA 420 

AGACATTAGT TATTGATGGC GATATGCGTA AgcCAACACA AAACTATATT TTTAATGAGC 480 

AAAATAATAA TGGACTATCA AGCTTAATCA TTGGTCGAAC GACTATGTCA GAAGCAATTA 540 

CGTCGACAGA AATTGAAAAT TTAGATTTGC TAACAGCTGG CCCTGTACCT CCAAATCCAT 600 

CTGAGTTAAT TGGGTCTGAA AGGTTCAAAG AATTAGTTGA TCTGTTTAAT AAACGTTACG 660 

ACATTATTAT TGTCGATACA CCGCCAGTTA ATACTGTGAC TGATGCACAA CTATATGCGC 720 

GTGCTATTAA AGATAGTCTG TTAGTAATTG ATAGTGAAAA AAATGATAAr AATGAAGTTA 780 

AAAAAGCAAA AGCACTTATG GAAAAAGCAG GCAGTAACAT TCTAGGTGTC ATTTTGAACA 840 

AGACAAAGGT CGATAAATCT TCTAGTTATT ATCACTATTA TGGAGATGAA TAAGTATGAT 900 

TGATATTCAT AACCATATAT TGCCTAATAT CGATGACGGT CCGACAAATG AAACAGAGAT 960 

GATGGATCTT TTAAAACAAG C6ACAACACA AGGTGTTACA GAAATCATTG TAACATCACA 1020 

TCACTTACAT CCTCGATATA CCACACCTAT AGAAAAAGTG AAATCATGTT TAAACCATAT 1080 

TGAAAGCTTA GAGGAAGTAC AAGCACTAAA TCTAAAGTTT TATTATGGTC AGGAAATAAG 1140 

AATTACCGAT CAAATCCTTA ATGATATTGA TCGAAAAGTT ATTAACGGTA TTAATGATTC 1200 

ACGCTATTTA CTAATAGAAT TTCCATCAAA TGAAGTTCCA CACTATACTG ATCAATTATt 1260 

TTTCGAATtA CAGAGTAAAG GCTTTGTACC GATTATTGCA CATCCAGAGC GGAATAAAGC 1320 

AATAAGTCAA AACCTTGACA TACTATACGA TTTAATTAAC AAAGGTGCTT TAAGTCAAGT 1380 

GACAACGGCG TCATTAGCGG GTATTTCCQG TAAAAAAATT AGAAAATTAG CAATTCAAAT 1440 
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GTTCTTAATG AAAGACTTAT TTAATGATAA 


GAAATTACX3T GATTATTATG AAGATATGAA 1560 




CGGATTTATT AGTAATGCGA AGTTAGTTGT TGATGATAAA AAAATTCCTA AACX5AATGCC 1620 

ACAACAAGAT TATAAACAGA AAAGATGGTT TGGGTTATAA ACAGCAAATG AGGGGTTTTA 1680 

TGGCACATTT ATCTGTGAAA TTGCGGCTTT TAATACTAGC ATTAATCGAT TCACTGATAG 1740 

TGACATTTTC AGTATTCGTA AGTTATTACA TTTTAGAACC GTATTTCAAA ACATATTCTG 1800 

TCAAATTATT AATATTGGCA GCTATATCAC TATTCATATC GCATCATATT TCaGCATTTA 1860 

TTTTTAATAT GTATCATCGA GCX3TGGGAAT ATGCCAGTGT GAGTGAATTG ATTTTAATTG 1920 

TTAAAGCTGT 6ACX3ACATCT ATCGTTATTA CGATGGTGGT CGTGACAATT 6TTACAGGCA 1980 

ATAGACCGTT TTTTAGATTG TATTTAATTA CTTGGATGAT GCACTTGATT TTAATAGGTG 2040 

GCTCAAGGTT ATTTTGGC6T ATTTATOGQA AATACCTTGG AGGTAAGTCA TTTAATAAGA 2100 

AGCCAACTTT AGTTGTTGGT GCTGGTCAAG CAGGTTCAAT GCTGATTAGA CAAATGTTGA 2160 

AAAGTGACGA AATGAAACTT GAACCGGTAT TAGCAGTCGA TGATGACGAA CATAAACGCA 2220 

ATATCACAAT TACTGAGGGT GTAAAAGTCC AAGGTAAAAT TGOGGATATT CCAGAACTAG 2280 

TGAG55AAATA TAAGATTAAA AAAATCATCA TTGCAATTCC AACTATTGGT CAAGAGCGTT 2340 

TGAAAGAAAT TAATAATATT TGCCATATGG ATGGCGTTGA GTTATTGAAA ATGCCAAATA 2400 

TAGAAGACGT CATGTCTGGT GAGTTAGAAG TGAACCAACT TAAAAAAGTT GAAGTAGAAG 2460 

ATTTACTAGG CAGAGATCCT GTTGAATTAG ATATGGATAT GATATCAAAT GAATTGACGA 2520 

ATAAAACTAT TTTAGTTACG GGTGCAGGTG GTTCAATAGG ATCAGAAATT TGTAGACAAG 2580 

TTTGTAATTT CTATCCAGAA CGTATTATTC TACTTGGCCA TGGTGAAAAC AGTATTTATT 2640 

TAATCAATCG TGAATTGCGA AATCGCTTCG GwAAAAATGT TGATATCGTT CCTATTATAG 2700 

CGGATCTGCA AAATAGAGCG CGTATGTTTG AAATTATGGA AACGTATAAA CCATACGCAG 2760 

TTTATCATGC AGCAGCACAC AAGCACGTGC CGTTAATGGA AGACAACCCT GAAGAAGCAG 2820 

TACGTAATAA TATTTTAGGT ACGAAAAATA CTGCTGAAGC TGCTAAAAAT GCAGAGGTAA 2880 

AGAAATTCGT TATGATTTCT ACGGATAAAG CCGTTAATCC GCCTAATGTC ATGGGAGCTT 2940 

CAAAGCGAAT TGCAGAAATG ATTATTCAAA GTTTAAATGA TGAAACGCAT CGAACAAATT 3000 

TTGTTGCAGT GAGATTTGGT AATGTACTTG GATCGAGAGG ATCTGTGATT CCACTTTTCA 3060 

AAAGTCAAAT TGAAGAAGGT GGGCCAGTTA CTGTGACACA TCCTGAAATG ACACGTTACT 3120 

TTATGACAAT TCCTGAAGCT TCTAGACTAG TTTTGCAGGC AGGGGCATTA GCAGAAGGTG 3180 

GCGAAGTATT T6TGCTAGAT ATGGGAGAAC CAGTGAAAAT TGTAGATTTG GCACGTAATT 3240 
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CCGGCGAAAA AATGTTTGAA GAGCTTATGA ATAAAGATGA GGTTCATCCT GAACAAGTAT 3360 

TTGAAAAAAT TTATCGTGGC AAAGTACAAC ATATGAAATG TAATGAAGTT GAAGCGATTA 3420 

TTCAAGACAT CGTCAATGAC TTTAGTAAAG AAAAAATTAT TAACTATGCC AATGGCAAAA 3480 

AGGGAGATAA TTATGTTCGA TGACAAAATT TTATTAATTA CTGGGGGCAC AGGATCATTC 3540 

GGTAATGCTG TTATGAAACA GTTTTTAGAT TCTAATATTA AAGAAATTCG TATTTTTTCA 3600 

CGCX3ATGAGA AAAAACAAGA TGACATTCGA AAAAAATATA ATAATTCAAA ATTAAAGTTC 3660 

TACATTGGTG ATGTGCGTGA TAGTCAAAGT GTAGAAACAG CAATGCGAGA TGTTCATTAC 3720 

GTATTCCATG CAGCAGCTTT AAAACAAGTG CCX3TCATGTG AATTCTTTCC AGTTGAGGCA 3780 

GTGAAGACAA ATATTATTGG TACAQAAAAT GTCTTACAAA GTGCTATTCA TCAAAATGIT 3840 

AAAAAAGTCA TATGTTTATC TACAGATAAG GCAGOGTATC CTATTAATGC TAGGGGTATT 3900 

TCAAAAGCAA TGATGGAAAA AGTATTCGTA GCCAAATCAA GAAATATTCG TAGTGAACAA 3960 

ACGCTTATTT GTGGTACAAG ATACGGTAAT GTGATGGCTT CAAGAGGATC AGTAATACCT 4020 

TTGTTTATCXJ ACAAAATCAA AGCTGGAGAA CCTTTAACGA TTACAGATCC TGATATCACA 4080 

AGATTTTTAA TGAGCTTAGA AGATGCGGTA GAACTAGTTG TTCATGCATT TAAGCATGCA 4140 

GAGACAGGAG ATATTATGGT TCAAAAAGCA CCAAGCTCAA CGGTAGGGGA TCTTGCGACC 4200 

GCATTATTAG AATTGTTTGA AGCTGATAAT GCAATTGAAA TCATTGGTAC GCGACATGGA 4260 

GAGAAAAAAG CAGAAACATT GTTGACGAGA GAAGAATACG CACAATGTGA AGATATGGGT 4320 

GATTATTTTA GAGTGCCGGC AGACTCCAGA GATTTAAATT ATAGTAATTA TGTTGAAACC 4380 

GGTAACGAAA AGATTACGCA ATCTTATGAA TATAACTCCG ATAATACACA TATTITAACG 4440 

GTGGAAGAGA TAAAAGAAAA ACTTTTAACA CTAGAATATG TTAGAAACGA ATTGAATGAT 4500 

TATAAAGCTT CAAT6AGATA GGAGAGATTG ACGTTGAATA TTGTAATTAC AGGAGCAAAA 4560 

GGTTTTGTAG GAAAAAACTT GAAAGCAGAT TTAACTTCAA CGACAGATCA TCATATTTTC 4620 

GAAGTACATC GACAAACTAA AGAGGAAGAA TTAGAGTCAG CATTGTTGAA AGCAGACTTT 4680 

GTCGTGCATT TAGCGGGTGT TAATCGACCT GAACATGACA AAGAATTCAG CTTAGGAAAC 4740 

GTGAGTTATT TAGATCATGT ACTTGATATA TTAACTAGAA ATACGAAAAA GCCAGCGATA 4800 

TTATTATCGT CTTCAATACA AGCAACACAA GATAATCCTT ATGGTGAGAG TAAGTT6CAA 4860 

GGGGAACAGC TATTAAGAGA GTATGCCX3AA GAGTATGGCA ATACGGTTTA TATTTATCGC 4920 

TGGCCAAATT TATTCGGCAA GTGGT6TAAG COGAATTATA ACTCAGTGAT AGCAACATTT 4980 

TGTTACAAAA TTGCACX5TAA CGAAGAGATT CAAGTTAATG ATCGGAATGT TGAACTAACG 5040 
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ATTGAAAATG 


GTGTACCTAC AGTACCAAAC GTATTTAAAG TGACATTGGG AGAAATTGTA 5160 


GATTTATTAT ACAAGTTCAA ACAGTCACGT CTCGATCGAA CATTGCCGAA ATTAGATAAC 5220 

TTGTTTGAAA AAGATTTGTA TAGTACGTAT TTAAGCTATC TACCTAGTAC aGACTTTAGT 5280 

TAyCCCTTAC TTATGAATGT GGATGATAGG GGTTCTTTTA CAGAATTTAT AAAAACACCG 5340 

GATCGTGGTC AAGTTTCTGT AAATATTTCT AAACCAGGTA TTACTAAAGG TAATCACTGG 5400 

CATCATACTA AAAACGAAAA ATTTCTAGTC GTATCAGGTA AAGGGGTAAT TCGTTTTAGA 5460 

TATGTTTCTG GCGACAAATT AGAAGTTGTA 5520 

GACATACCAG TAGGATACAC ACATAATATT GAAAATTTAG GCX3ACACAGA TATGGTAACT 5580 

ATTATGTGGG TGAATGAAAT GTTTGATCCA AATCAGCCAG ATACGTATTT CTTGGAGGTA 5640 

TAGCGCATGG aAAAACTGAA rTTAATGACA ATAGTTGGTA CAAGGCCTGA AATCATTCGT 5700 

TTATCATCAA CGATTAAAGC ATGTGATCAA TATtTTAA 5738 



(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9062 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 

ATCATCAACA AGAATGATAT TTTTCCCATC TACTATATCT TTTACCGCAG ATAACTTCAC 60 

TCTCACACCT TGCTCACGTA ATTCTTGAGT TGGTTGAATA AATGTTCTTG CAACATATTG 120 

ATTTTTAACT AGTCCCATTT CATATGGCAA ACCTATTTCT TCAGCATAAC CACTCGCAGC 180 

TGATAGCGAT gAATTGGGTA CACCGATGAC CATATCAGCA TTTACAGGGC TTTCTTGGGC 240 

TAATTTTTTA CCAGAAGCTT TACGTACTGC ATGGACATTT TTACCAGCTA TTGTTGAGTC 300 

TGGTCTAGCA AAATAAATAT ATTCCATCGC AGAAATTGCA OTTGTCGTAT GATGTGTATA 360 

AGATTTAACT GTAATACCTT TATCGTTAAT CACGACATAT TCACCTGCAT GAATATCTTG 420 

AACAAATTCT GCACCTAACA CATCTATTGC ACATGTTTCA CTTGCAAGGA TGTATGTCCC 480 

ATCTTTCATT TTACCTACAA CAAGTGGTCT GATAGCATTT GGATCTACTG CGCCATATAA 540 

CGCATCTTTA GTTAAAATCG CAAATGTAAA ACCGCCTTTA ACTTTTCGCA AACTTTCTTT 600 

CAACGCTTCC TCAAAAGTAG GAGCTTTACT TCGACGTATC AAATGCATAA TGACTTCAGT 660 

ATCAGAAGAC GAATGGAAGA TAGCACCTTG TTTTTCTAAA TTCTGACGCA ATGATTTAGC 720 
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CGGTTGAATA TTTTCAATAC CTTTATTACC TGAAGTAGCA TAACGGACGT GACCAATTGC B40 

ATGTTGATAT CCTTTTAATC GTTCCATTTG ATCATCTTTA ATCGCTTCAG TTAGTAAGCC 900 

TAATCCTCGC TCGCCTTTTA ATTCATTTTG ATCAGAAACA ACTATACCTG cACCTTCTTG 960 

ACCACGATGT TGCAAACTAT GAAGTCCCAT ATAtGTTAGT TGCGCTGCtT CaGGATGATT 1020 

CCAAATACCA AACACGCCAC ATTCTTCGTT TAATCCTGAG TAGTTAAACA TTGaGCAATT 1080 

GCCCCtTCCC ATArrTGTTT AATATCTGAA ACATTTTCAC TAATCTCTGT aTATGGTGTT 1140 

GTTACCTTGr aATTATCACT ATCTGTTAAA AGTCCAATTT CTATTGCATT ATCAATATTT 1200 

AAAGTTTTAC CTGATTTAAC AGAAACAACA TATCGGCCTT GCGTCTCACT AAACAATTGT 1260 

GCATTTGTTA TATCTATTGA AGATTTTAAT CXTTAAACCGT AATGC6CACT TAGTTTAGCT 1320 

AAGGTAATCA GTAAGCCACC TTTACCAACT GTTTGAACAT GTGATAATAG TCCTTCACGA 1380 

ATAGCGGTCT TGATTGATTC ACCTTTTTCA ACTTCTGAAC TCAAATCTAA TGACTCAAAT 1440 

TCATGATTAA CTTTGCCATA AATTAACTTT TCAAGTTGAC TACCACCAAA GTCGTCCTTA 1500 

GTATCACCGA TTAAATATAA TTTATCTCCA ACTTGAGGTT CAAAATCATT TAAATAATTT 1560 

ACATTTTCAA TCAAACCTAC CATTCCAACA ACTGGTGTTG GGAAAATAGA AGTACCTTTC 1620 

GTTTCGTTAT ATAAAGATAC ATTACCAQAA ACTACTGGTG TCTTAAQAAT GTCJGCATGCT 1680 

TCTGCCATAC CTTTCGTTGA ATCTATCAAC TGTTGATAGA TTTCTTTCTT TTCAGGAGAA 1740 

CCATAATTTA AACAATCTGT CATTGCTAAT GGTGTTGCAC CCACGGCAAT TAAATTTCGA 1800 

TAAGCTTCAG CTACTACCAT CTTTCCACCT TCATATGGAT TGTTATATAC ATAACGCGCT 1860 

TCACCATCAA TTGTTGAAGC AATTGCCTTA TTTGTGCCTT CCACACGTAC TACCGATGCT 1920 

TGAAGTCCTG GCTTAATTAT CGTATTGGCA CCAACTTGTT GGTCGTATTG ATCATATAAA 1980 

TAGTCTTTAG ATGCTATAGT CGGATOCTTA AGTAATTTAA AQAAAOTATC TTTAACATCX3 2040 

ATGTGTGTAT AATCATTTTT AGAAGTATTA TAATCTTTTT CTTCTCCTTC TAAAATATAT 2100 

ACAGGTGCTT CATCAGCTAG TGGTTCAACT GGAATGTCAG CATAAACTTC GTCATCATAT 2160 

GTTAAAACAA AACGATTTGT ATCTGTAACT TCACCTATAA CAGCACTATC CAATTCGTGC 2220 

TTATCAAATA AATCTAAGAA TTTTTGTTCA GTACCTTTTT CAACAACTAG TAACATACGT 2280 

TCTTGAGTTT CTGAAAGCAT CATTTCATAA GGAGAAATAC CTGGCTCACG TGTTGGCACT 2340 

TGTTCTAATC TCAAATGTAA CCCACTACCA CCTTTTGCCG CCATTTCAGA CGATGAAGAT 2400 

GTTAAACCAG CAGCACCCAT ATCTTGAATA CCAACTAATT CATCAAATGT AATTGCTTCA 2460 

AGTGTTGCTT CCATTAATTT TTTACCTACA AATGGATCAC CGATTTGTAC AGAAGGTCGT 2520 
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CGACCAGTTT TCAAACCAAC ATAAATGACC GAATTACCTA CACCTTTTGC TGTGCCTTTT 2640 

TGAATCATGT CXSTGATTGaT AACACCAACA CACATTGCAT TAACAAGTGG ATTGCCATCA 2700 

TAACGTTCAT CAAATTCGAT TTCACCAGCA GTTGTTGGaA TACCAATGCA GTTACCATAA 2760 

CCTCCGATAC CCTTTACAAC ACCTTTAAGT AATCTTTGGT TTTGTTTATT ATCTAATTCT 2820 

CCAAATCTAA GACTGTTTAA CAAATTAATA GGTCTAGCCC CAATAGAGAC AATGTCACGA 2880 

ATGATTCCAC CAACGCCTGT AGCAGCCCCT TGATATGGTT CAATTGCTGA TGGATGATTG 2940 

TGAGACTCTA CTTTAAATAC TACGGCTTGA TTATCACCTA TATCGACTAC CCCTGCACCT 3000 

TCACCAG6CC CCATAAGCAC ATGGTcACCT GACGTAQ6AA ATTGCTTTAA AAACGGTTTA 3060 

GAATGTTTAT AAGAGCAATG TTCACTCCAC ATAACAGAAA AGATACCTGT TTCTGTAAAG 3120 

TTAGGTTGTC TGCCTAAAAT ATCX3CAAACT TTTTCATATT CTTGATCaCT TAATCCCATA 3180 

TCTTGATATA CTTTTTCAAG TTTAATTTCT TCAAOGCTTG GTTCGATAAA TTTAGACATG 3240 

TTGTTCCCTC CAACTTTTTA CCATCGCTTC AAATAATTTC ACACCACTAT CAGTACCTAA 3300 

CAACGTTTCT AAAGCTCTTT CagGATGtGG CATCATGCCA CATACATTGC CTTTTrCGTT 3360 

AACAATTCCT GCAATATCAT CATATGAACC GTTCGGATTA TTCACATATT TCAGAATAAT 3420 

TTGATTGTTA GCTTTTAATT GTTGATATAT TTCATCAGTA CAATAATAAT GACCTTCACC 3480 

GTGAGCTACA GGATATATAA CTTTTTCACC TTGTTCATAA AGATTTGTAA ATGCCGTTTG 3540 

ATTATTCACT ATTTCTAACT CTTCATTTCT ACTAATAAAT AAATGTGAAT CGTTATGCAA 3600 

TAATGCACCA GGTAATAAGC CTATTTCAGT TAAAATTTGA AACCCATTAC AAACACCTAA 3660 

TACTGGCTTA CCTTCAGCTG CAAGACGTTT AACTTCCGAA ATAATCGGsG CTACACTAGC 3720 

CATTGCCCCA GATCTTAAGT AATCCCCGAA TGAAAATCCA CCAGGAATAA GTACGCCATC 3780 

AAAT€CACTT AGTGATGTTT CTCTATAATC TACATATTCC GCTTCAACAC CACTTTTAAT 3840 

AGCAGCATTA AACATGTCTC TATCACAATT CGAACCTGGA AAAACAAGAA CCGCAAATTT 3900 

CATTTTATGC ATTCTCCTTT TCATCATCTA ACACTTTATA GCTATATTCT TCAATCACTG 3960 

TATTTGCAAA CAATTTTTCA CTTAGAGTTG TAATAATGTT GTGTACCTTT TCATCACTAA 4020 

CCTCATCCAC TGTCATATAT AATACTTTTC CTACACGAAT ATCATTCACT TGTGCATAAC 4080 

CTAAGTCATG TACAGCTCGA GTAAGCGTTT GTCCTTGCGT ATCTAATACT TGTGGTTGTA 4140 

ATGTGATATG TAGTTCAATT GTTTTCATTA TTTTAAATCC TCCAATTTGT TTAAAAATAT 4200 

TTGATATGTT TCAATCAGTG ATCCAGTGTT ATTTCTATAT ACATCTTTAT CAAAGTTTGC 4260 

ATTGGTAGCT TTATCCCAAA TTCGACATGT ATCTGGAGAT ATTTCATCCG CTAACAAAAT 4320 
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ATCCATTAAT TGTTTCAACA 


CATTATTAAT CTTTAATGCT TTGGATTTTA GTATTTCAAT 4440 




ATCTTCATCT GATGCTATAT TGAGCAATTT AACATGGTCA TCCGTTATCA ACGGATCATT 4500 

TAACGCATCA TTTTTATAGA AAAATTCTAC AAGTGGTTCT CTAAAAACTT CACCATTTTC 4560 

AAAACCTAAA CGCTTTGTAA TAGATCCACT AGCAATATTA CGAACAACTA CTTCTAATGG 4620 

AATTATTTTC ACAGGCTTAA CTAATTGTTC TGTTTCAGAT AATTGTTTAA TAAAGTGACT 4680 

TTCTATTCCA TTTTCTTGTA AATATTTAAA TATAATAGAA GTAATTTGAT TATTTAATCG 4740 

CCCCTTACCT GCCATTGTGT CTTTCTTAGC CCCGTTTCCA GCAGTAACTT CATCTTTATA 4800 

TTCAACTCTT AATTCATTTT CTTGATTTGT TGAGAAAAT6 CGcTTCGCTT TTCCTTCATA 4860 

TAATAATGTC ATGCTTTAAT TACTCCCCTC AAATTTAGCG TACATATCTT GTTCAGTTTG 4920 

GTTTACATCA TTCGTTAGTA CAGTCATATG CCCCATTTTT CTGCTATCTT TACGCTCAGA 4980 

CTTACCATAA ATATGTAAGT GCCACTCTGG ATGTTCATTA AATTCATTTT CCAATAAATC 5040 

TAAATCTTTA CCTAGTAAGT TCATCATGAC TGCTGGCTTT AATAATTCAA TTGAATTTGG 5100 

TAATGATTGT CCGGTAACTG CTAAAATATG AGTATCAAAT TGTGAATAAT CACATGCTTC 5160 

AATTGAATAA TGTCCGGAAT TGTGAGGCCT TGGTGCTATC TCGTTCACAT ACAATTGGTT 5220 

GTTACTATCT ATAAAAAATT CAACTGTAAA TGTTCCAATG AAATGAATCG ATTGGATAAT 5280 

TTTATTAACT TGCTCTTTCG CCTCAGCTGT TTTATCTATT CTCGCTGGAA CAATTGTTTT 5340 

GAAAAGTATT TGATTTCTAT GCTCATTTTC TTGTAATGGG AAAAAAGTGA TTTGATTGTT 5400 

GTTTCCTCTT GTAACAGTAA GAGATACTTC TTTCTTGATA TTCAAATATT TTTCAGCTAC 5460 

GCATTCACTA GTTTCAATTA ATTTAAAACC TTCTTGTAAG TCTTTTTCGT TGTTAATTAA 5520 

AACTTGACCT TTGCCATCGT AGCCACCAAA TCTAGTTTTT ACAATAAAAG GATATCCTAA 5580 

TGn^'CAATT GCTTTGTCAA TATCTGTAGA TTCTTTTACT GAAATGAACG GGACAACTTT 5640 

GGTACCAGCA CTTTTTAATG TTTCTTTTTC AGTTAAGCGA TCTTGTAATA ACTGTATAGC 5700 

TTGGTAACCT TGCGGAATAT TGTACTTTTC ACATAATAGT TTTAATTGTT GGGCTGAAAT 5760 

GTTTTCAAAT TCATAAGTAA TCACATCACA TTTTTGTCCT AATTGATTGA GTGCCTTTTC 5820 

ATCGTCATAC TTGGCTTGTA TAAATTCGTG TGCAACX3TAT CTACATGGAC AATCTTCAGA 5880 

AGGATCCAAT ACAACCACTT TATAACCCAT TTTTTGAGCT GATTGTGCCA TCATCTTTCC 5940 

AAGCTGACCA CCACCAATAA TGCCAATAGT CGCACCAAAC TTTAATTTAT TGAAGTTCAT 6000 

TTTGCATGTC CTCCACTTTT TGAATTAACG AAGATTCATA CTGATTTAGT TTTTCAACTA 6060 

AAGAAGGATT TTGAATACTT AACATTCTTG CTGCAAGTAT ACCTGCGTTT TTAGCACCTG 6120 
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10 






AAGAATCTAT ACCCTTTAAA CTTTTTGTTT CAATCGGCAC TCCAATAACT GGTAGCGTCG 6240 

TTAATGATGC AACCATACCT GGTAAATGTG CCGCACCGCC AGCGCCTGCA ATGATAATGT 6300 

TTATACCTCT TTCTCTCGCT TCAGAAGCAA ATTGAACCAT CATTTTTGGC GTACGATGTG 6360 

CGGATACTAC TTGTTTTTCG TACGGAATTT CAAAATAATC CAACATGTTA CAACTCTCTT 6420 

GCATAATTTT CCAATCGGAA GAACTGCCCA TAATGACTGC TACTTTCACT TTGTACACCC 6480 

TTTCAAAAGT TTQAATTGTG AATTACTTTA GTTGTATATT ATAGATATAG CATAACAAGC 6540 

AATTTCTGCT TTTTCAATCA AAAATCGAAC TTTATTTTGA TTTTTTATTT GAATTTAOGT 6600 

CTTTTGCTAT GTAAATTAGT TTTATAAACT AACAAAGTTA GGATATTGAC AATAGGAGGA 6660 

GAAGTTTTTA TGGTTGCTAA AATTTTAGAT GGTAAACAAA TTGCCAAAGA CTACAGACAG 6720 

GGGTTACAAG ATCAAGTTGA AGCGCTAAAA GAAAAGGGTT TTACACCTAA ATTATCCGTT 6780 

ATATTAGTTG GTAATGATGG CGCTAGTCAA AGTTATGTTA GATCAAAAAA GAAAGCAGCT 6840 

GAAAAAATTG GTATGATTTc AGAAATCGTA CATTTGGAAG AAACAGCTAC TGAAGAAGAA 6900 

GTATTAAACG AACTAAATAG ACTAAATAAT GATGATTCTG TAAGTGGTAT TTTGGTACAA 6960 

GTACCATTAC CAAAACAAGT TAGCGAACAG AAAATATTAG AAGCAATCAA TCCTGAAAAA 7020 

GATGTGGACG GTTTTCATCC AATAAATATA GGGAAATTAT ATATCX3ATGA ACAAACTTTT 7080 

GTACCTTGCA CACCGCTCGG CATCATGGAA ATATTAAAAC ATGCTGATAT TQATTTAGAA 7140 

GGTAAAAATG CAGTTGTAAT TGGACGAAGT CATATTGTCG GACAACCAGT TTCTAAGTTA 7200 

CTACTTCAAA AAAATGCATC AGTAACAATC TTACATTCTC GTTCAAAAGA TATGGCATCA 7260 

TATTTAAAAG ATGCTGATGT CATTGTCAGT GCAGTTGGTA AGCCTGGTTT AGTAACAAAA 7320 

GATGTGGTCA AAGAAGGAGC AGTAATTATC GATGITGGCA ATACGCCAGA TGAAAATGGC 7380 

AAATTAAAAG GTGACGTTGA TTATGATGCG GTTAAAGAAA TTGCTGGAGC TATTACACCA 7440 

GTTCCTGGTG GCGTTGGTCC ATTAACAATT ACTATGGTAT TAAATAATAC TTTGCTTGCA 7500 

GAAAAAATGC GTCGAGGTAT TGATTCGTAA AGAGCCTGAG ACATAAATCA ATGTTCTATG 7560 

CTCTACAAAG TTATAATGGC AGTAGTTGAC TGAACGAAAA TTCGCTTGTA ACAAGCTTTT 7620 

TTCAATTCTA GTCAACCTTG CCGGGGTGGG ACGACGAAAT AAATTTTACG AAAATATCAT 7680 

TTCTGTCCCA CTCCCTAATA ACTGAGTTTT AATGAAGTCT TTTAACCCAC ATTAAATATT 7740 

ATTTTGCAAT TGCAATGAAT AACAAGAAAA ATCTGGGACA TTAATCGATC AAATGCTCCC 7800 

TTCAAAGTAG ACATTGAATA AATGAAGGCT TTGAAGGGAG CATTTCACTT TGTACTTGGC 7860

TCAACAATTT TATATAGACA GTAGTTAATT GAATGAAAAT AA6CTTGTAA CAAGTTTTCA 7920 
GTTGGGGATG GGCCCCAACA CAGAAGCTGT GACTATGATA AAGTACTACT ACATAGTTAA 8040

TCATTAGTGG TTCTTTATCA TTTTCGCCTC CCTTTTCTTA TTGTTTTGAT ACACAAAAAT 8100 

TTAAGTTCAA ACTGTCGAAT AAAGTTATAT TTGATTTCAA ATTATCCCTA AATTATTAAT 8160 

TkTACAATTG TGGCAGATTT TCAAAATAAT AATTATTTCC TCATTATTTA TAAATTTATA 8220 

TTTAAATTTC ATTCTTTATA GGGTAAGATT AGGACTATAG TATGATGTGT ArATAATATA 8280 

AATTAAGGTA TAGTAAAGCT AACTCAGAAA TGACTTATCA TTCGGAGGTT ACATTATGAA 8340 

TAAACTATTA CAGTCATTAT CAGCCCTCGG TGTTTCTGCT ACACTAGTAA CACCAAATTT 8400 

AAATGCAGAT GCAACGACGA ATACTACACC ACAAATTAAA GGCGCTAATG ATATCGTTAT 8460 

TAAGAAAGGT CAAGATTATA ACCTTCTAAA CGGCATAAGT GCATTTGATA AAGAAGATGG 8520 

AGATTTAACC GATAAAATTA AAGTCGATGG CCAAATTGAT ACATCTAAAT CTGGTAAATA 8580 

TCAAATTAAA TATCATGTCA CTGATTCAGA TGGTGCAATT AAAATTTCCA CTAGGTATAT 8640

TGAGGTTAAA TAGCCCTCAT CACTATACTG CAAATAAAAT GGTAGCAAAC GAACATGTTT 8700 

TGCTACCATT TTATTTGTTA TTCTAACTTC ATCTGCAACT TTAACCCAAA TATTGTATTT 8760 

TTTCTGTATA CCAAAGGACT ACCTATCAAA TTATTAAAAC TTAACTGCTC TTTTTAAAAA 8820

AATGTTTTGA TTTTGAACAA ACAAATTTCC ACTTTTCATT GTTTAACGAT AAATTACTTT 8880 

TGGCAAATTC CTTATTAAAA TGTTTGCGCT TCCTTTCAAT CAACTAGCCA TCATTTTCAA 8940

TTTATTAGAC AATTTCAAAC TTTTTTTATT TTCATTCAAT TAACCTTTAA TTGAAAGCTA 9000 

TTCTCAACTT TCCTTTTAAA TATGAAGCAA TTTTTTCAAA AACGCTATTA GTCACAAAAT 9060 

GT 9062 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2738 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

AAATATTTTT TCAAAACTAT GTGAAAATGG aCCATGTCtA aATCATGTAA TAATGCAGyA 60 

CATAATGCCA AOGGTCTmTC TTTATTGTCC CATGCATCAT GACCAATAAA TGACTCATCA 120 

ATTAATCGTC TAACTATTTC ATACACACCT AAAGAATGTC CAAAGCGACT ATGTTCTGCT 180

GTGTGAAAAG ATAGGTACAG TGTTCCTAGT TGTCTAATTC GACGTAACCT TTGGAATTCC 240 
TCTTTAAAAA CTTTTTCTTC TACTAATTTT AAATCTACAT ATGCGTTAGT CATTATTCCC 360

CTCCTTTTCG TTTAATATAA TATTTAATTT ACTTAAAATG CTTTGTACAT AAGTGCTAAG 420 

TCTAACTTTT CGCCATACAT TTCTGGCTCA TAAGAGCGTA AGATTGTAAA ACCTTGCTCT 480 

TTATAGTAAG CTACTGCTTC TTCATTTTTA TTATCTACTT CTAAGTAAAC ACCTTCAAAT 540 

TTATCTTCAA AACGTGATAA TCCTTCATTT AACAATGCTG TACCATAACC TGTATGTTGC 600 

GATTCTGGTT TAACATAATG AGCTGATAAA TATAATTCTT CACCGTAAAT AAAGTTAGCA 660

AAGCCAACGA TGTCATTACC TTCTTCAACG ACTAAGAATA ATTGTTCTTG AAGTCTTTTC 720

TTTAAATGAT GTTCATTATA TGAAGCTtCT AACAAGTGAT TAACTGTTGT CGCAGCGTAT 780 

ATATTTAAGT ATGTATTAAA CCAAGCTTTA GTTGCGACAT CTCTAATTTG AACAACATCT 840 

TTTTCAGTTG CTTGTCTTAC CTTGAACATG ACTTTCTCCC CTTATTAACA AGTTTTAATA 900 

ACGGCATTAT ACCACAACTT GCTCAATACT TAATAAACAA TGATTGTCTA TTCAATTTAT 960 

ATATtTATAT TTTCCGTTAA AATTAAAAAT AAAAAATAAC GAAGCAAAAA AtCACTTOGT 1020 

TTAGTATGAG GTATGTCTTA TTGCAATATA CTATTCCACT CAGTTGCACG TGCTAAGGCA 1080 

TAGTTGTCTT TCATGATGTC ACCAGGCTTT TCAGCAGTTC CAATAATATA ACCATTTAAA 1140

GTGGCACCTA rAAAGTCTAA ACTATATTTC ATTTGCGTAA TTGCTGGTTC GCTTTTATTT 1200 

TTGGACAATC TCCACCAACT AAAATAACTC TAAAATCCTT TTCGGCCATT TGTGCCTTAA 1260 

AATTAGGATA TCGTTTATCT TGTAATGTTT CTGACCAATG TTCGATAAAT GCTTTCAATG 1320

GTGCTGAAAT GCTATACCAA TACACTGGTG ATGCAAAAAT AATTGTATCA CTAGCCAATA 1380

TTTTATCTAG AATCGGCAAA TAGTCATCGT CATATGAAGT AATAGTCTCT GCTGTATGTC 1440 

TCACGTCACG TATCGGTTTA AACTGATGTT GTGTCACGTC AATCCATTGA TACTCTAAAT 1500 

CTTGCAAAGC GAATTTTGTT AATTGTGCAG TATTACCGTT TGGTCTACTC CCACCAAACA 1560 

AAACAGTAAT CATTTTAGCC TAACCTCACT TTTGATTAAT AAATATCTGT GTTTTTCGTT 1620 

ACCTAATTAT ACTATCATAA GCTTTGCCTA CCGAATAGTA AAACGCTTAC AACTTTTATA 1680 

TAAATTTGAC GAAATTTCGT CATGCCTTAT ATAACGTCGT TTGTGATACG GGGCTAATTC 1740 

ATGATGAAAT TAGATACATA TATCACCATT AAATACAATT CATTTAGTCT TCAATCGGAA 1800 

ACAGTTCATC GATATATTGA ATCTCATCAT CTGATAAAAC GATATCTGCA GCTTTAATAT 1860 

TTTCAACGAC TTGTTCTGCA CGTTTTGCAC CAGGAATAAT CACATCGATA GCTGGTCTCG 1920 

TTAAATAAAA TGCTAATACA ATGTTCGCAA TTGAAGTTTG ATGTGCTGCA GCTATGCTTT 1980

CCAAAGCTTT TACGCGACGC ACATTTTCTT CAAATACACC TGGTTTAAAA TCACGACGTG 2040 
GCTAATGGGA AATATGGAAT AAATGTGATT TGGTGATCAA CACAATATTG TAATACTGCC 2160

TCATTTTCXjC GATGCAATAA ATTATATTCT AACTGTACAA CATCAACGTA ACCATCTTTA 2220 

TTTGCTTCTT TAAGTTGATC TAATGTGAAA TTTGATACAC CAATTGCTTT AATCTTCCCT 2280

TGTTCCTTAA GCTCTTGTAA TGCTGCAACT GCTTGATCTT TCGGAGTGTT GTTATCCGGA 2340

AAATGAATAT AATATAAATC GATATAATCA GTTTGTAGAC GTTTCAAACT ATTCTCAACT 2400 

TGTTGTTTTA AATATTCCGG TTGATTGTTC TGATGTACTT CTTGATTTTC ATCAAATTCA 2460 

TGAGACCCTT TCGTAGCAAT TTTAATTTGC TCTCGCGGAT ATTCTTTAAC AACTTCTCCA 2520 

ACCAATTCTT CTGATOGTTC TGGCCCATAA ATATATGCCG TATCTAATAA ATTAATACCA 2580 

TGATTAATGG CTTGACGAAC AACATCTTTT CCTTGTTCTT CATCTAAGTT CGGATATAAA 2640 

TTATGCCCAa CCTAtGCGTT CXSTCCCAAGT GCGATTGGAA ACACTTCAAC ATCAGATTTA 2700 

CCTAAGTTTA CAAATTGCTn CATTAGACCC AGCnCCTT 2738

(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9425 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

GATTAGATGA TATTTAACGA AAATTAaGrT GmAATACTtG AATGTArGAa GTCTGATGTC 60

GAAAATAGCT ATTAAAATAG AGTAGACGTA ATGtAAATGA AAGCACCTAA AATAGAAAAA 120

TTTCAAAAAT AGCGTAATTA TTATAATAAA TAGACTGCCA ATAAAATGCA ATTTTTCACT 180

TATAACATTC TTCAAAAAAT AATA6CAAAA TTATGTAAAA AATATCTTGT CAT6GCAAGA 240

TTGGCTGTGC TATAATCTAT CTTGTGCTTA AGAACGGCTC CTTGGTCAAG CGGTTAAGAC 300

ACCGCCCTTT CACGGCGGTA ACACGGGTTC GAGTCCCGTA GGAGTCACCA TTTTTTAGGT 360

CTCGTAGTGT AGCGGTTAAC AOGCCTGCCT GTCACGCAGG AGATCGCGGG TTCGATTCCC 420

GTCGAGACCG TACAAATGCC TATCCAAGAG GATAGGCATT TTTTTGCGTT TAATATTATA 480

TTAATAAAAG ATATATGGAC GAATGATAAT CATATTGATT TATCTGTTCG TCCATTTTCT 540

TTAAAATGTA TGAACCTCAA GTAACTTAGT GGTTGGATAT GAAA6ATAAA CGTAGACAAT 600

AAAATCTTTA TTAGACGTAC AAACATATGC TACTGTCAAC ATATTTCTTC GTTGTGATAT 660

GCCACCA6TC CTCCATAACA TCAATTGTTA AAGTAACGAA TAACGAATAA TGATATTTAT 720
GACCTCATCA TTGTGTTAAA TATCATTGTC ACAATCCGCC GTGAGAAACT AATAAAAAAT 840

AGTAATATAT AAGTTTATAT TGGAAAATAG AATTAATAGC TTATAAATGG TAAATTATAT 900

AATAGGTTAC TATACGTTAT AAGACGGAAA ATGCGCACAA TAACAAAAAT AGTAAGCGAC 960

ATCCT6TGAT TTTTTACACA AACATAAACG ATAAAGAACA AAAAATGATA AAATAATATT 1020 

AATGATTTAA GAAAAGAGGT TTATGCAAAT GGCTAGAAAA GTTGTTGTAG TTGATGATGA 1080

AAAACCGATT GCTGATATTT TAGAATTTAA CTTAAAAAAA GAAGGATACG ATGTGTACTG 1140

TGCATACGAT GGTAATGATG CAGTCGACTT AATTTATGAA GAAGAACCAG ACATCGTATT 1200 

ACTAGATATC ATGTTACCTG GTCGTGATGG TATGGAAGTA TGTCGTGAAG TGCGCAAAAA 1260 

ATACGAAATG CC3UITAATAA TGCTTACTGC TAAAGATTCA GAAATTGATA AAGTGCTTGG 1320 

TTTAGAACTA GGTGCAGATG ACTATGTAAC GAAACCGTTT AGTACGCGTG AATTAATCGC 1380 

ACGTGTGAAA GCGAACTTAC GTCGTCATTA CTCACAACCA GCACAAGACA CTGGAAATGT 1440 

AACGAATGAA ATCACAATTA AAGATATTGT GATTTATCCA GAC3GCATATT CTATTAAAAA 1500 

ACGTGGCXSAA GATATTGAAT TAACACATCG TGAATTTGAA TTGTTCCATT ATTTATCAAA 1560 

ACATATGGGA CAAGTAATGA CACGTGAACA TTTATTACAA ACAGTATG6G GCTATGATTA 1620

CTTTGGCGAT GTACGTACGG TCGATGTAAC GATTCGTCGT TTACGTGAAA AGATT6AAGA 1680 

TGATCCGTCA CATCCTGAAT ATATTGTGAC GCGTAGAGGC GTTGGATATT TCCTCCAACA 1740 

ACATGAGTAG AGGTCGAAAC GAATGAAGTG GCTAAAACAA CTACAATCCC TTCATACTAA 1800

ATTTGTAATT GTTTATGTAT TACTGATTAT CATTGGTATG CAAATTATCG GGTTATATTT 1860 

TACAAATAAC CTTGAAAAAG AGCTGCTTGA TAATTTTAAG AAGAATATTA CGCAGTACGC 1920 

GAAACAATTA GAAATTAGTA TTGAAAAAGT ATATGACGAA AAGGGCTCCG TAAATGCACA 1980

AAAAGATATT CAAAATTTAT TAAGTGAGTA TGCCAACCGT CAAGAAATTG GAGAAATTCG 2040 

TTTTATAGAT AAAGACCAAA TTATTATTGC GACGACGAAG CAGTCTAACC GTAGTCTAAT 2100 

CAATCAAAAA GCGAATGATA GTTCTGTCCA AAAAGCACTA TCACTAGGAC AATCAAACGA 2160 

TCATTTAATT TTAAAAGATT ATGGCGGTGG TAAGGACCGT GTCTGGGTAT ATAATATCCC 2220 

AGTTAAAGTC GATAAAAAGG TAATTGGTAA TATTTATATC GAATCAAAAA TTAATGACGT 2280 

TTATAACCAA TTAAATAATA TAAATCAAAT ATTCATTGTT GGTACAGCTA TTTCATTATT 2340 

AATgCACAGT CATCCTAGGA TTCTTTATAG CGCGAACGAT TACCAAACCA ATCACCX3ATA 2400 

TGCGTAACCA GACGGTCGAA ATGTCCaGAG GTAACTATAC GCAACGTGTG AAQATTTATG 2460

GTAATGATGA AATTGGCGAA TTAGCTTTAG CATTTAATAA CTTGTCTAAA CGTGTACAAG 2520 
GTGATGGTAT TATTGCAACA GACCGCCGTG GACGTATTCG TATCGTCAAT GATATGGCAC 2640

TCAAGATGCT TGGTATGGCG AAAGAAGACA TCATCGGATA TTACATGTTA AGTGTATTAA 2700 

GTCTTGAAGA TGAATTTAAA CTGGAAGAAA TTCAAGAGAA TAATGATAGT TTCTTATTAG 2760 

ATTTAAATGA AGAAGAAGGT CTAATCGCAC GTGTTAACTT TAGTACGATT GTGCAGGAAA 2820 

CAGGATTTGT AACTGGTTAT ATCGCTGTGT TACATGACGT AACTGAACAA CAACAAGTTG 2880 

AACGTGAGCG TCGTGAATTT GTTGCCAATG TATCACATGA GTTACGTACA CCTTTAACTT 2940 

CTATGAATAG TTACATTGAA GCACTTGAAG AAGGTGCATG GAAAGATGAG GAACTTGCGC 3000 

CACAATTTTT ATCTGTTACC CGTGAAGAAA CAGAACGAAT GATTCGACT6 GTCAATGACT 3060 

TGCTACAGTT ATCTAAAATG GATAATGAGT CTGATCAAAT CAACAAAGAA ATTATCGACT 3120 

TTAACATGTT CATTAATAAA ATTATTAATC GACATGAAAT GTCTGCX3AAA GATACAACAT 3180 

TTATTCGAGA TATTCCGAAA AAGACGATTT TCACAGAATT TGATCCTGAT AAAAT6ACGC 3240

AAGTATTTGA TAATGTCATT ACAAATGOGA TGAAATATTC TAGAGGCGAT AAACGTGTCG 3300 

AGTTCCACGT GAAACAAAAT CCACTTTATA ATCGAATGAC GATTCGTATT AAAGATAATG 3360 

GCATTGGTAT TCCTATCAAT AAAGTCGATA AGATATTCGA CCGATTCTAT CGTGTAGATA 3420

AGGCACX;TAC GCGTAAAATG GGTGGTACTG GATTAGGACT AGCCATTTCG AAAGAGATTG 3480

TGGAAGCGCA CAATGGTCGT ATTTGGGCAA ACAGTGTAGA AGGTCAAGGT ACATCTATCT 3540 

TTATCACACT TCCATGTGAA GTCATTGAAG ACGGTGATTG GGATGAATAA TAAGGAGCAT 3600 

ATTAAATCTG TCATTTTAGC ACTACTCGTC TTGATGAGTG TCGTATTGAC ATATATGGTA 3660 

TGGAACTTTT CTCCTGATAT TGCAAATGTC GACAATACAG ATAGTAAGAA GAGTGAAACG 3720 

rAACCTTTAA CGACACCTAT GACAGCCAAA ATGGATACAA CTATTACGCC ATTTCAGATT 3780 

ATTciTTCGA AAAATGATCA TCCAGAAGGA ACGATTGCXSA CGGTATCTAA TGTGAATAAA 3840 

CTGACGAAAC CTTTGAAAAA TAAAGAAGTG AAGTCCGTGG AACATGTTCG TCGTGATCAT 3900 

AACTTGATGA TTCCTGATTT GAACAGTGAT TTTATATTAT TCGATTTTAC GTATGATTTA 3960 

CCGTTATCAA CATATCTTGG TCAAGTACTG AACATGAATG CGAAAGTACC AAATCATTTC 4020 

AATTTCAATC GTTTGGTCAT AGATCATGAT GCTGATGATA ATATQGTGCT TTATQCTATA 4080

AGCAAAGATC GCCACGATTA CGTAAAATTA ACAACTACAA CGAAAAATGA TCATTTTTTA 4140 

GATGCATTAG CAGCAGTGAA AAAAGATATG CAACCATACA CAGATATCAT CACAAACAAA 4200 

GATACAATTG ATC6TACGAC GCATGTTTTT GCACCAAGTA AACCTGAAAA GTTAAAAACA 4260 

TATCGCATGG TATTTAACAC GATTAGTGTT GAGAAAATGA ATGCTATACT ATTTGACGAT 4320 
GCAAACTATA AC6ATAAAAA T6AAAAATAT CATTATAAAA ACCTGTCCGA AGATGAAGCG 4440

AGTTCCAGCA AAATGGAAGA AACGATTCCA GGAACCTTTG ATTTTATTAA TGGTCATGGT 4500 

GGTTTCTTAA ACXSAAGACTT TAGATTGTTT AGTACGAATA ATCAGTCAGG CGAGTTAACA 4560 

TATCaACGTT TCCtTAATGG TTATCCAACG TTTAATAAAG AAGGTTCTAA TCAAATTCAA 4620 

GTCACTTGGG GTGAAAAAGG CGTCTTTGAC TATCGTCGTT CGTTATTACG CACCGACGTT 4680 

GTTTTAAATA GTGAGGATAA TAAATCGTTG CCGAAATTAG AGTCTGTACG TTCAAGCTTA 4740

GCGAACAATA 6TGATATTAA TTTTGAAAAA GTAACAAACA TCGCTATCGG TTACX3AAATG 4800 

CAGGATAATT CAGATCATAA TCACATTGAA GTGCAGATTA ACAGTGAACT C6TACCGCGT 4860 

TGGTATGTAG AATATGATGG CX3AATGGTAT GTTTATAACG ATGGGaGGCT TGaATAAATG 4920 

AACTGGaAAC TGACAAAGAC ACTTTTCATT rrCGTGTTTA TTCTTGTCAA CATCGTGTTA 4980 

GTATCGATTT ATGTTAATAA AGTCAATCGC TCACACATTA ATGAAGTCGA GAGTAACAAT 5040

GAAGTTAATT TTCAGCAAGA AGAAATTAAA GTACXX3ACTA OTATATTG/^ TAAATCAGTT 5100 

AAAGGTATAA AATTAGAGCA AATTACAGGG CGATCAAAAG ACTTTAGTTC TAAAGCTAAA 5160 

GGCGATTCGG ATTTGACCAC ATCAGATGGT GGAAAATTAT TGAATGCGAA CATTAGTCAA 5220

TCGGTAAAGG TCAGTGACAA TAACTTAAAA GATTTGAAAG ATTATGTTAA CAAGCGCGTA 5280 

TTTAAAGGTG CTGAATATCA ATTAAGCGAG ATTAGTTCAG ATTCTGTAAA ATATGAACAA 5340

ACGTATGATG ATTTTCCGAT TTTAAATAAC AGTAAAGCGA TGTTAAACTT TAATATAGAA 5400 

GATAACAAAG CGACTAGTTA TAAACAATCA ATGATGGATG ACATTAAGCC CACAGATGGT 5460 

GCAGATAAGA AGCATCAAGT GATTGGTGTG AGAAAAGCAA TCGAGGCATT ATATTATAAT 5520 

CGTTACTTGA AAAAAGGTGA TGAAGTCATT AATGCTAGAC TCGGTTACTA CTCAGTCGTG 5580 

AATGAAACGA ATGTTCAATT GTTACAACCA AACTGGGAAA TTAAAGTGAA GCATGACGGT 5640 

AAGGATAAAA CGAATACTTA CTATGTCGAA GCGACAAATA ATAACCCTAA AATTATTAAT 5700 

CATTAATATG AATCGTAATA AGCTAGCATT GCAAGCTCAT CATATGTGAG AAGCX3GTGCT 5760 

AGCTTTTTTG CTGGTACXiGT TTATTATGGC TGATGTTTTT GCGTCTCCAA CGTGCX5CATT 5820 

TATTCATATT TTAAGTAGAA CCGCATTGTA AAATTAGTGT AACTGTTATT TTAAAAACTT 5880 

TAGTATTTGT CTAATCATTG TTATAATAAT TAAGAAATTC ATTGCACGTG ATTATCAAAA 5940 

TTTAAATATA AGAAACCGGT CGATGAACTA AAGTTACATA ATAGGAAAGG TATACAAAAC 6000 

AGCTAATATA CTGATAGTTT CTGTAGGGAA AATCX3TATAT TTGCACTGAT GTATATTGCA 6060

GTCATATAGA GAGATTGACT GTTTAAAGAG AAAGGATGAG CCGCTTGATA CGCATGAGTG 6120 
TAGTTGATGT TGGTTTGACT OGAAAGAAAA TGGAAGAATT GTTTAGTCAA ATTGACCGTA 6240

ATATTCAAGA TTTAAATGGT ATTTTAGTAA CCCATGAACA TATTGATCAT ATTAAAGGAT 6300

TAGGTGTTTT GGCGCGTAAA TATCAATTGC CAATTTATGC GAATGAAAAA ACTTGGCAGG 6360

CAATTGAAAA GAAAGATAGT CGCATCCCTA TGGATCAGAA ATTCATTTTT AATCCTTATG 6420

AAACAAAATC TATTGCAGGT TTCGATGTTG AATCGTTTAA CGTGTCACAT GATGCAATAG 6480

ATCCGCAATT TTATATTTTC CATAATAACT ATAAGAAGTT TACGATTTTA ACGGATACGG 6540

GTTACGTGTC TGATCGTATG AAAGGTATGA TACGTGGCAG CGATGCGTTT ATTTTTGAGA 6600

6TAATCATGA CGTCGATATG TTGAGAATGT GTCX5TTATCC ATGGAAGACG AAACAACGTA 6660

TTTTAGGCGA TATGGGTCAT GTATCTAATG AGGATGCGGC TCATGCAATG ACAGACX5TGA 6720

TTACAGGTAA CACGAAACGT ATTTACCTAT CGCATTTATC ACAAGACAAT AACATGAAA6 6780

ATTTGGCGCG TATGAGTGTT GGCCAAGTAT TGAACGAACA CGATATTGAT ACX3GAAAAAG 6840

AAGTATTGCT ATGTGATACX; GATAAAGCTA TTCCAACGCC AATATATACA ATATAAATGA 6900

GAGTCATCCG ATAAAGTTCC GCATTGCTGT GAGACGACTT TATCGGGTGC TTTTTTATGT 6960

TGTTGGTGGG AAATGGCTGT TGTTGAGTTG AATCGGCTTG ATTGAAATGT GTAAAATAAT 7020

TCGATATTAA ATGTAATTTA TAAATAATTT ACATAAAATC AATCATTTTA ATATAAGGAT 7080

TATGATAATA TATTGGTGTA TGACAGTTAA TGGAGGGAAC GAAATGAAAG CTTTATTACT 7140

TAAAACAAGT GTATGGCTCX5 TTTTGCTTTT TAGTGTAATG GGATTATGGC AAGTCTCGAA 7200

CGCGGCTGAG CAGCATACAC CAATGAAAGC ACATGCAGTA ACAACGATAG ACAAAGCAAC 7260

AACAGATAAG CAACAAGTAC CGCCAACAAA GGAAGCGGCT CATCATTCTG GCAAAGAAGC 7320

GGCAACCAAC GTATCAGCAT CAGCGCAGGG AACAGCTGAT GATACAAACA GCAAAGTAAC 7380

ATCC^CGCA CCATCTAACA AACCATCTAC AGTAGTTTCA ACAAAAGTAA ACGAAACACG 7440

CGACGTAGAT ACACAACAAG CCTCAACACA AAAACCAACT CACACA6CAA CGTTCAAATT 7500

ATCAAATGCT AAAACAGCAT CACTTTCACC ACGAATGrrr GCTGCTAATG CACCACAAAC 7560

AACAACACAT AAAATATTAC ATACAAATGA TATCCATGGC CGACTAGCCG AAGAAAAAGG 7620

GCGTGTCATC GGTATGGCTA AATTAAAAAC AGTAAAAGAA CAAGAAAAGC CTGATTTAAT 7680

GTTAGACGCA GGAGACGCCT TCCAAGGTTT ACCACTTTCA AACCAGTCTA AAGGTGAAGA 7740

AATGGCTAAA GCAATGAATG CAGTAGGTTA TGATGCTATG GCAGTCGGTA ACCATGAATT 7800

TGACTTTGGA TACGATCAGT TGAAAAAGTT AGAGGGTATG TTAGACTTCC CGATGCTAAG 7860

TAcTAACGTT TATAAAGATG GAAAACGCGC GTTTAA6CCT TCAAC6ATTG TAACAAAAAA 7920

TGAAGGCATT AAAGGCGTTG AATTTAGAGA TCCATTACAA AGTGTGACAG CGGAAATGAT 8040 

GCGTATTTAT AAAGACGTAG ATACATTTGT TGTTATATCA CATTTAGGAA TTGATCCTTC 8100 

AACACAAGAA ACATGGCGTG GTGATTACTT AGTGAAACAA TTAAGTCAAA ATCCACAATT 8160 

GAAGAAACGT ATTACAGTTA TTGATGGTCA TTCACATACA GTACTTCAAA ATGGTCAAAT 8220 

TTATAACAAT GATGCATTGG CACAAACAGG TACAGCACTT GCGAATATCG GTAAGATTAC 8280

ATTTAATTAT CGCAATGGAG AGGTATCGAA TATTAAACCG TCATTGATTA ATGTTAAAGA 8340 

GGTTGAAAAT GTAACACCGA ACAAAGCATT AGCTGAACAA ATTAATCAAG CTGATCAAAC 8400 

ATTTAGAGCA CAAACTGCAG AGGTAATTAT TCCAAACAAT ACCATTGATT TCAAAG6AGA 8460 

AAGAGATGAC GTTAGAACGC GTGAAACAAA TTTAGGAAAC GCGATTGCAG ATGCTATGGA 8520 

AGCGTATGGC GTTAAGAATT TCTCTAAAAA GACTGACTTT GCCGTGACAA ATGGTGGAGG 8580 

TATTCGTGCC TCTATCGCAA AAGGTAAGGT GACACGCTAT GATTTAATCT CAGTATTACC 8640 

ATTTGQAAAT ACGATT6CGC AAATTGATGT AAAAGGTTCA GACGTCTGGA CGGCTTTCGA 8700 

ACATA6TTTA GGCGCACCAA CAACACAAAA GGACGGTAAG ACAGTGTTAA CAGCGAATGG 8760 

CGGTTTACTA CATATCTCTG ATTCAATCCG TGTTTACTAT GATATAAATA AACCGTCTGG 8820

CAAACGAATT AATGCTATTC AAATTTTAAA TAAAGAGACA GGTAAGTTTG AAAATATTGA 8880 

TTTAAAACX5T GTATATCACG TAACGATGAA TGACTTCACA GCATCAGGTG GCGACGGATA 8940 

TAGTATGTTC GGTGGTCCTA GAGAAGAAGG TATTTCATTA 6ATCAAGTAC TAGCAAGTTA 9000 

TTTAAAAACA GCTAACTTAG CTAAGTATGA TACGACAGAA CCACAACGTA TGTTATTAGG 9060 

TAAACCAGCA GTAAGTGAAC AACCAGCTAA AGGACAACAA GGTAGCAAAG GTAGTAAGTC 9120 

TGGTAAAGAT ACACAACCAA TTGGTGACGA CAAAGTGATG GATCCAGCGA AAAAACCAGC 9180 

TCC^GTAAA GTTGTATTGT TgtAGCGCAT AGAGGAACTG TTAGTAGCGG TACAGAAGGT 9240 

TCTGGTCGCA CAATAGAAGG AGCTACTGTA TCAAGCAAGA GTGGGAAACA ATTGGCTAGA 9300 

ATGTCAGTGC CTAAAGGTAG CGCGCATGAG AAACAGTTAT TTCATAATCA ACAGTCATTG 9360 

ACGTAGCTAA GTAATGATAA ATAATCATAA ATAAAATTAC AGATATTGAC AAAAAATAGT 9420 

AAATA 9425 
(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3886 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

AGTTGTAATG TCACATTTCC AGAGTCTGAA ATTATCTTTA TCACGTTACA TTTACTAGGC 60 

TCTAAAATGA CTGAACATAC AGCATCTTCA ATTACCTTTG AATACCATGA TTTATCGCAA 120 

AATATACATG AATTGATCAC TTGTGTTAGC CAAGAATTAG GCATTGATAT GTCAAAAGAC 180 

AACAAGTTAC ATACCAGTCT GATCACACAT ATCAAACCAG CTATACATCG TATTAAATAC 240

GATATGCTAC AACCTAATCC TTTGAGGCAA GAAGTTATGC GTCGCTATCC TCAAATCATT 300 

GAAGCCGTTA GCAAGCATAT TAGTCCAATT GAACAAGATG CTGCTATTCG CTTCAACGAA 360 

GATGAATTAA CATACATTAC AATTCACTTC GCATCAAGTA TAGAGCGTGT TGCAACACAT 420 

AAACAATCAA TGATTAAGGT TGTCTTACTA TGTGGTTCTG GTATAGGCAC GTCACAACTT 480 

TTAAAATCAA AACTAAATCA CCTGTATCCT GaGTTnCACA TTTGGGAtGC CTATTcCATT 540 

TaTcAATTGG aAGaAAGTCG ATTATTGCAA GATAACATTG ATTATGTCAT TTCAACAGTA 600 

CCTTGTGAAA TATCAGCTGT ACCAGTTATT CATGTCGATC CATTTATCAA TCAACAATCT 660 

CGTCAAAAAT TGAATCAAAT TATCAATGAC TCAAGAGAAC AACGAGTCAT GAAAATGGCA 720 

ACTGATGGCA AGTCACTCGC AGATTTATTG CCTGAACATC GCATCATTAT AAATAAACAA 780 

CCATTATCAA TTGAATCCGC AATTGCAGTG GCTGTGCAAC CTTTAATCAA TGATGGCATT 840 

GTCTATTCAA ATTATACAGC TGCAATTTTA AAACAATTTG AACAATTCGG GTCATATATG 900 

GTCATTAGTC CACATATTGC ACTTATTCAC GCTGGTACTG ATTATGTACA GAATGGTGTA 960 

GGTTTCGCAC TAACATATTT CACTGAAGGG ATTATCTTTG GTAGTAAAGC TAACGATCCC 1020 

GTTCACCTTG TAATTACATT AGCAACGGAC CACCCCAATG CACATTTAAA GGCATTGGGA 1080 

CAGTTAAGCG AATGCTTAAG CAACGACTTA TATCGACAAG ATTTCTTAGA TGGGAATATT 1140

TTT^PWVTTA AACAACACAT TGCTTTAACT ATGACAAAGG AGGCTTAATA ACGTGTCATT 1200 

AGACATTTTG TCAACAACAC GCATCATTGT AAAAGAACAA GTAAATGATT GGACTGAAGC 1260 

TATAACTATA GCTTCTCAGC CATTACTACA AGAACAAATT ATTGAACAAG GCTATGTTCA 1320 

AGCAATGATT GATAGCGTTA ATGAACTTGG ACCTTATATC GTTATCGCAC CTGAAATTGC 1380 

AATTGCACAT GCAAGACCGA ACAATGACGT ACATCAAGTT GGTTTAAGTC TATTAAAGTT 1440 

GAATCAACAT GTGGCATTTT GTGATGAAQA TCACTACGCA TCTCTCATTT TTGTATTGAG 1500 

TGCCATCGAC AATCATTCAC ACTTATCTGT ATTACAAAAT TTAGCAACCG TACTGGGCGA 1560 

TAACCAAACA GTCCAGCAAC TATTAACTGC AACAAATGCA CAAGACATTA AAAACATTTT 1620 

AAAGGAGCAT GATTAATATG AAAATTTTAG TAGTATGTGG CCACGGTTTA GGAAGTAGTT 1680 
AAGTTGAACA TAGTGACATT ATGACAGCAA GTCCAGAGAT GGCTGACTTG TTTATTTGTG 1800

GTAGAGATTT AGCTGAAAAT GCCGAACGTC TAGGGGATGT CTTAGTTCTT GATAATATTT 1860 

TAGATAAAGC TGAATTACAA CAAAAGCTCT CAGAAAAATT ACAACAACTT AACATGATTT 1920

AAAGGAGGTA CGACCTATGC AAGCAATCCT TAATTTTATA GTCGATATTT TAAGTCAACC 1980 

AGCCATTCTT GTTGCACTGA TTGCCTTTAT AGGTTTAATC GTTCAG/U^ AACCTGCCGC 2040 

AACGATCACT TCAGGAACCA TTAAAACGAT ATTAGGCTTC TTAATTTTAA GTGCAGGTGC 2100 

TGATGTCGTC GTTC36ATCTC TTGAACCATT CGGCAAAATA TTCCAACACG CATTTGGTGT 2160 

GCAAGGTATC GTACCTAACA ACGAAGCTAT CGTCTCACTA GCCTTAAAAG ATTTTGGAAC 2220 

AACAGCTGCA CTCATCATGG TCTGTGGCAT GATTGTTAAT ATTTTAATTG CCCGCTTCAC 2280 

TAATTTAAAA TATATCTTTT TAACAGGTCA TCATACATTT TACATGGCTG CGTTTTTAGC 2340 

AATCATTTTA ACA6TCAGTC ATATTAAAGG CTGGCTAACG ATTGTTATCG GCGCACTCGT 2400

ATTAGGATTA ATCATGGCAG TATTACCTGC ATTACTCCAA CCTACGATGC GAAAAATTAC 2460 

AGGGAATGAC CAAGTAGCTT TAGGTCATTT TOGCTCAATC AGTTACTTTG CCGCAGTQCT 2520 

GTAGGTCAAT TATTCAAAGG TAAGTCTAAA TCAACGGAAG AGATTAAATT TCCAAAAGGC 2580

TTAAGTTTCT TACGAGAAAG TACAATTAGT ATCTCGATTA CGATGGCATT ACTTTACTTC 2640 

ATCGCATGCT TATTTGCGGG CGTTAGTTAT GTACACGAAT CTATTAGTGA TGGTCAAAAC 2700 

TTTATTGTCT TTTCATTAAT TCAAGGTGTG ACATTTGCTG CTGGTGTATT TATTATTTTA 2760 

ACGGGCGTTC GTTTAATCTT AGCTGAAATC GTCCCAGCAT TTAAAGGAAT TTCTGAAAAG 2820 

CTTGTACCAA ATTCTAAACC TGCATTAGAC TGCCCTATTG TGTTCCCTTA TGCACAAAAT 2880 

GCAGTATTAA TTGGATTCTT TGTCAGCTTT ATTACAGGTG TCATCGGTAT GTTTATCTTA 2940 

TTCTTATTTG GTGGCGTCGT CATTTTACCT GGCGTAGTTG CACACTTCTT CTTAGGTGCA 3000 

ACGGCTGCTG TATTCGGTAA TGCAAGAGGC GGTATTAAAG GTGCTATTGc TGGCGCCGCT 3060 

CTAAATGGTA TCCTAATCAC GTTTTTACCA TTATTATTCT TGCCATTTTT AGGCGAATTA 3120 

GGTGGTGCTG CAACAACATT CTCA6ATACA GACTTTTTAG CT6TCGCTAT CGTGTTCGGT 3180 

AACGCAGTAA AATATATGGG ATTATTTGGT GCGATTCTAT TTATTATTAT CGTAGGTGCG 3240

ACAACAATTT TATTAAAAGG CCX3TCAAAAA GAACAGCAAT AGTGTTAACG TAGAAATATA 3300 

AAACACCGTC ACATATTGAG TGAATGCCCC TTTtATCAAG AGGAAAGCCA CTTACTTAT6 3360 

GACGGTGTTT TGTATTATAT TAAATGATAC TTAGCCATAC TATCGACAGC TGCTAAAATT 3420 

GCTTCTTCTT GTGTCGCAAT CGGTTCCCAA CCAAGTAATG TTTTTgCACG TTCGTTACTT 3480 
CCTAGACTCA AAATAAAGTC TGGTAATTTT TTAGTAGAAA CTTTTTGAGC TATTTCAGGT 3600

CTCTTTTCTT TAATTAATTT TGCAATTTCC AACAAATTAA TTTGTCCATC AGCCX3TCGCA 3660 

ATAAATCGCT TGCCATTAGC TTGTTCATTT GTCATTGCCA AAATGTGCAG TTCAGCTACG 3720

TCTCTCACAT CAACAACATT TAACGGAATT TGCGGTACAC GTTTCATTGA ACCATTCAAT 3780

AAATTTTCTA ATAAATGAAA GCTTCCTGAA ACGTGTGCAT CTAATGATGG CCCAAAAATT 3840 

GCAACTGGAT TGATTGTGGC AAATTCTACT GTTGTATTTT CATTCT 3886 
(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4879 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

GTCATCTATC AAAAATTTGG TATACAGACC GACAATTATT AATTAATAAT TTAATTTCCC 60

AGGCAATACC AGTGATTAAA TATCCACAAA TACAACATAA AGAACAACCA TTAGAATCTA 120

TTTCACAACT TATATTGTCT AAGATGACAT CTAATCAATA GTGTTTAAAT TTCTCAGTGG 180

CTGTGAATGA GGTTTAAAAG TACTATAAAA OGTAAACTTT GATACTTTAA AATACGCAAA 240

AAACGGTAAA CCCTAATTCA TATTATAGAG TTTACCGTTT TATTTTTTAA CTTGCATCAT 300

AGTTATATTA ACATTATTGT TGGTAGTTTG GATCAGTAAC CATTGCTTGT CCAGTATAAT 360

CAACCGTTAC AATTGAATAT TTTCCaTTTG CATTTGGGTC TTTAAAACTA AACACATACT 420 

TATAGTTGCC ATTATGTTCT TCAATAGAAT AATCATTATA CACTTTATTA TTACTACCAA 480 

ATTT&TTTGC TTCATTATTA GCCGCATTTA AAGCTGTTTG GAAATTTGGC AATTGCTGTA 540 

AAGCfTTGATT TTTATTTCCA TTAAACGGAT AAATTTGACG TGCAACCGGC GCGQCATTTT 600 

GnCCATAATA TGGTGCAACG TAACTTGATT TTTGATTATT ATTCGCTTGG TTATTACTTG 660 

ATTGGTTATT ATTTGTTTGG TTTTGGTCAT TGTTTGTTGC ATTTGAATTA GATTGTTGCT 720 

GGTTATCGTT TGCACTATTA TCTTTATTAT CTTTGTTTAC GTCTTTACTA TCATCTTTAT 780 

TATCTTTCTT ATCTTTAGAT GAATCATTTG TTTTTTTATC TTGTTGTTCA GTTTTCGCTT 840 

TATCATCTTT TTCTTTATTA CCGTCTTTTT GTTGGTCACT ATCTTGACCA CATGCAGCTA 900 

AAAATAATGA TAATGCTAGT AACCCTGTAA CTAATCTTTT CATACATATC TCCTCCTATA 960

ATTCGATATT CATTGAATAA TCTTGAAATA CATATCTACC ATGTGTATCT TTTCATGGCT 1020 

TAAGGTTCTT TTTATTATAC CCTAATTTTT GTTCATTATT ATTTAATTTT TGTGAATTTT 1140

ATGtTTkCTA TAAATTTAAT TATTTTACTT TAACAATTCA TTACGCATTT AGCATTTCAA 1200 

GGTATACACA ATATTTATTA CTATGATTTC ATTTTATCTG CTGCAAAAAC AATCATTATA 1260

ACTCTTTTTC CATAATTAAA TCTGTATCCG TTACATCACC TGTTTGAAAA TGATGTTCAC 1320 

CAACCACTTT AAATCCATGA CGTTTATAAA ATGCTTGAGC ACGAGGATTA TGCTCCCAAA 1380

CTCCTAGCCA AATTTTATGT TTATTATGTT CTTGAGCAAT TTTTTCGGCC AATTCTATCA 1440 

ATTGTGAACC TCTTCCGCCA CCTTGAAAGT CTTTCAAAAA ATATATGCGC TGCACTTCTA 1500 

AATAGGTCTC CCCCATTTCT TCAGTTTGAG CACTATTAAT ATTCATCTTT ATATAACCAA 1560 

CATTOGCACC ATCTTCTTGa TAAAAATAAT GAAATGAATC TACATGGTTA ATCTCTT6TG 1620 

TAAATTTCTC TACAGTATAA TTGTCTTTAA AAAATTGATC AAAATCTTTG TCATCATACT 1680

AAGAACCAAA CGTGTCATAA AATGTTCTAG TTGCTAATTC AACTAATTCA CTAGCATTTT 1740

GTTCTGAAAT TTCTTTGATT ATCCCAGCCA TATAAATCCT CCAATAAACA GTGATCGAAT 1800 

CAAAATATTA CTTATGTTAT TTTTCAGCCA AAACTATTTA AAAATACATT AACACAAATC 1860 

AATTACAAAT TGTATTGATT GTGTGTAACA TCAATAAATG ATACATTTAT TCCAGTAAAA 1920

TGGCCGTATT TTCAAAAGA6 AAAAAGAGAG GATGTATCX3T TGTGATAGAA ACATTTAAAG 1980 

CGTTTGTAAT TGATAAAGAT GAGAGTGGTA AAGTGACACC AACTTTCAAA CAATTATCGC 2040 

CTACTGATTT ACCTAAAGGA GATGTGCTGA TTAAAGTACA TTACTCTGGT ATAAATTATA 2100 

AAGATGCTTT AGCGACTCAA GATCATAATG CAGTCGTAAA ATCGTATCCT ATGATTCCAG 2160 

GAATAGATTT AGCTGGAACA ATTGTTGAAT cCGAAGCACC AGGCTTTGAa AAAGGAGAAC 2220 

AAGTAATTGT AACGAGTTAT GACCTAGGTG TCAGCCATTA TGGCGGTTTT AGTGAATATG 2280 

CGCCTGTAAA ATCAGAATGG ATTATCAAGC TTCCTGATAC TTTAACATTA GAAGAATCAA 2340 

TGATATATGG CACAGCTGGT TATACTGCCX3 GTTTAGCAAT TGAAAGACTT GAAAAAGTT6 2400 

GAATGAATAT TGAAGATGGT CCTGTACTCG TTCGCGGTGC TTCAGGTGGT GTCGGTACTT 2460 

TAGCAGTACT CATGCTTAAT GAACTTGGTT ATAAAGTTAT CGCAAGTACA GGTAAACAAG 2520 

ATGTTAGCGA TCAATTACTT GAACTTGGTG CCAAAGAAGT TATCGATCGA CTTCCTGTTG 2580

AAGATGATCA TAAAAAGCCA CTCXtCATCAT CAACTTGGCA AGCTTGTGTA GACCCTGTTG 2640 

GTGGCGAAGG TATTAATTAT GTTACAAAGC GTTTAAATCA TAGTGGGTCA ATTACAGTTA 2700 

TTGGTATGAC TGCCGGTAAT ACTTATACTA ATTCTGTATT CCCTCACATT TTAAGAGGTG 2760

TAAACATTTT AGGAATTGAC TCGGTATTTA CTGCTATGAA ATTAAGACAG CGCGTTTGGC 2820 
TTGATGAACT TCCAGAACAA CTTAACAAA6 TAATTAAACA TGAAAATAAA GGGCGCATTG 2940

TTATCGATTT CGGTGTAGAT AAATAGTATT CATGAAAAAG ACATCCCGTT ATGCGAGATG 3000 

TCrrTTTTAA TTTAGTATTT GATATACATA CCGCCTGAAT CTGGTTCGGT AGGTATAAAT 3060

CCAAATTTTG TATATAATTT ATCCGCTGGG TAGTCTGCAA TCAGAcTAAC GTATGTACTC 3120 

TCAACAGCCA CACCTTTAAT ATATTGCATA ATATGCTCCA TAATTAGACT GCCGTAACCT 3180 

TGACCTTGGT AACTTTTCAA AACTGCAATA TCAACAATTT GAAAAACAGT TCCGCCATCG 3240 

CCAATCACTC TACCCATACC AATTAACCGA TCTTTATCAT ACAAGGTTAC TGTAAATAAG 3300 

GCATTAGGTA ATC C TTTTT C aGCTGTTCGC GCGTCTTTGG ACTCATACCT GCGTTAATCC 3360 

TTAATGCGCA ATAATCCTCG CAAGTCGGAA TATCATATGT CACTTTAACC ATTATTTACC 3420 

CCACTTTTOV TCACACAATA TATCAACCTA GTATAAATGT TTATTTACAA TAGTCTTATT 3480 

CGCTTCTTTA AACACTTCAT GATGACTTGA AACATAACCC TCTGCATTCO CATCTGGTTG 3540 

GATATATGTT TTAGCAAGGT TCGCTGCATT TGCACCATCA CTAAATGCAC TTGCAATTAG 3600 

ATGTGATTTT GCATCATGAT AAACAATATC TCCACACGCA TAGATACCAG GTATACTAGT 3660 

TGTCGTATTA CCAAATCCTT TAACACX3ACA ATCATCATGC ATATCTAGCT TTGAAGATGT 3720

TtCACTCAAT AATGTATTAC AACGATCAAA CCCATGACTA ATAATGACAT CX3TCAAATTT 3780 

AACTGTATGC CTATCGCCAC TTTCAACATG TTCCAAAACA ACTTCACTTA TATGCGTTTC 3840 

ATCATCATTG CCGACCAAGT ATTTAATACO TGTTTTTGGG CATAGTTTCA CATTTAAATC 3900

TGTCACCAAC GTTTTCATC6 CTTCATGACC ACTTACATCT TCTTTTCGAT AAACAACTGT 3960 

CACGCTTTTA GCAATCTTGG CAATATCATG CGCCCAATCT AATGCTGTAT TTCCTCCACC 4020 

TGATATTAAT ACATCTTTAT CTTTGAAACG TCTGTAACTT TGTACAACAT AATGTAAATT 4080 

AGTrSATTGA TATCTCTCTA CACCTTTAAC ATCTAATTGT TTTGGATTAA TAATACCCGC 4140 

ACCAATTGCA ATGATAACTG CTTTCGATGT ATATATTTCT CCCGCTTCTG TTTCAACTTC 4200 

GAAATGACGT TCTGCCTTTT TCCTAATATC TACCACACGT TCATTCAAAT GAACTTCCGG 4260 

TTTAAAATAT AATCCTTGCT TAATTGTATC TTTTAAAATT TCATGACAAG GTTTTGGCGC 4320 

AATGCCGC€:A ATATCCCAAA TAATTTTTTC AGGGTAAATT CTCATCTTAC CCCCTAATTC 4380 

AGATTGAACA TCTATCAATC TTACAGACAT ATCTCGCAAT CCAGCATAAA AGCTTGCATA 4440 

CAAACCAGAC GGACCGCCAC CAATGATTGT AACATCTTTC ATTATGTGCC TCCTATGACT 4500 

CTCTATATTC ATTTCTTTCA TTAACXSTGCT CAAATTGATA ATTATTATCA TTTAAAGCCA 4560

TTATACTATT AATATTTATA TTGTTAAAAT AAATCGCATA GTTAGCCATG AATTATCAAT 4620 
GAAAGATGTG TATATTTTTT AGTTCTAGTT ATATTATTTT TTAAAAGACT CATCACGTGG 4740

TTCTTTAAGA ATTGCTTGTC TTAAAAGGAA AAATAGCAAC AATAAACCTG CAAGCATACC 4800

TGTGTGCCCA ATACCTGCAA AGCCTGCnAA TGCTTCTGGA GAGTATGATT TACCAGTGAC 4860

TTGGAAGAAT CCTTTTGTC 4879

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1560 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 

ATAATGTCTT AGaTTGATTG GGAGTTTTTT TAATTTTTTT GAAATTAAAT TAATCTGTAs 60 

yTAATAAAAA ATTTGAATAA CTGACACAyT TTTTTGATCA TAGCTAyATA CTTTGTGAAT 120 

TAATTCACAT TATAATAAGA GTGAAGATAA GAGTATTATA AATnATCTTT AAATAAATAT 180 

ATGTGAAGTA AAAATTACAC GTTAGCATAT CGATTATGgT CATTTCkTTT AACATATTAA 240 

CTgGGGaACG TTAAAAGTTA ACGGkTGATA TCyAACtAAA AACAAGGTCA CAGTAGTATG 300 

TTTTAATCTG GCGTCTATTA CAAATAAAAA TTACATCTAT AATTATTCGT TTTCTTTTTT 360 

GAAAGTAATA GCCAATTAAT ATCATACATA CTGGAGTGAC TATAAGGAGG ACATTATTAT 420 

GAGAGCAGCA GTTGTAACGA AAGATCACAA AGTAAGTATT GAGGACAAAA AGTTAAGAGC 480 

TTTAAAACCT GGTGAAGCGT TGGTACAAAC GGAATATTGT GGCGTTTGTC ATACCGATTT 540 

ACATGTTAAG AATGCTGATT TTGGTGATGT TACAGGCGTT ACTTTAGGTC ATGAAGGTAT 600 

TGC^AAGTC ATCGAAGTTG CGGAAGATGT AGAATCATTA AAAATTGGAG ACCGTGTGTC 660 

TATCGCTTGG ATGTTCGAAA GCTGTGGAAG ATGTGAATAT TGTACAACAG GTCGTGAAAC 720 

ACTTTGCCGT AGTGTGAAAA ATGCTGGTTA TACAGTAGAT GGTGCAATGG CTGAACAAGT 780 

TATTGTTACT GCAGACTATG CTGTGAAAGT ACCTGAAAAA TTAGATCCAG CAGCAGCGTC 840 

TTCTATTACA TGCGCAGGTG TGACAACTTA TAAAGCTGTA AAAGTAAGTA ATGTAAAACC 900 

TGGACAATGG TTAGGTGTTT TTGGTATAGG TGGTTTAGGT AACCTAGCTT TACAATATGC 960 

TAAAAACGTT ATGGGGGCTA AAATTGTTGC CTTCGACATC AATGATGATA AATTAGCATT 1020 

CGCGAAAGAA TTAGGTGCTG ATGCTATTAT TAATTCTAAA GATGTTGATC CAGTTGCAGA 1080 

A6TTATGAAA TTAACTGATA ACAAAGGATT AGATGCAACA GTGGTAACTT CAGTTGCTAA 1140 
TTTACCTGTT GATAAAATGA ACTTAGATAT CCCAAGATtA GTGCTTGATG GTATTGAAGT 1260

AGTAGGTTCA CTTGTTGGTA CAAGACAAGA CTTACGTGAA GCGTTTGAAT TTGCTGCTGA 1320 

AAATAAAGTA ACACCTAAAG TTCAATTAAG AAAATTAGAA GAAATCAATG ATATTTTTGA 1380 

AGAAATGGAA AATGGTACTA TAACTGGTAG AATGGTTATT AAATTTTAAA AATATCAACT 1440 

GACTATATAG ATAAAGAAGG TAGTGCTCTG AACACTATCA TTATTAATCA AACCCCGAGG 1500 

TTTTCCTGAA AAGATAGTGG nAAATCCCCG TGTTTTTTGG GTTTGAGGnG GTTGTnTGTA 1560 
(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11014 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

GTCCTGTnGC TGCAATGAAT ACGCCTAAAA ATCCAGGGAT GTAATGGATA CTTTGTGGTA 60 

GTACTAATGA TAGAAATGAT AAAAATGAAA TCACAAAGGC TACGCTCGCA AAAGCTTGAC 120 

ATGTACGCTT ATOGCCATAA TCTAACCCTG TACGTATATG TAATAAATAC TGTAATCCGA 180 

TACTTAAATA CATAATTGCC ACGCATAAGA AGAATGGGAA GAATGTCTTT TCAAAGTCCG 240 

GATATAGGCT GTTAGATAGG AAGACCATGA TGAACATATT AAACATCATA AACGAGACGT 300 

CTTTGAATGT AACTTGACCA AATCGATTTG TAAAAAATGT TTGATGAGAC CACATTAACC 360 

ATAAGAACAA ACTCATGACG ATGTATTTGA AAAATAAATC AGCTGAAATG GAACCGTTTT 420 

GTGTTGTTAA AATCACATGT GCAATTTTTT GAATGGCATA GACGAAAATT AAATCAAAGA 480 

ACAACTCATG GAATCCTGCA CGCTTTTCAG CTAAATGTTT TGGTGTTAAT GCATTAACCA 540 

TAAAATTTTA ACTCCTTTAA GATGTGTAAT TAATTTACTA AGTATACTAT TTATTTTTTC 600 

TAGTGAATAG GGGCAGATTT GGCGATGAAG TGGAAGGAGA GGTGACTGCA AOGTAATTGC 660 

GGAATTAACA ATCATCAGCG ATTTAATATT TGACTGGAGA CGTCATGGTA ATAAAAAATT 720 

GATGAGAAAT TGATGGTGAA ACCAGCTGTG AATAsCGaTG cAATGATrsA TAGaATTTAA 780

TTAGAGTCAT TACGCGaAAT GATTAATGAT AATTTGTGGT AAATCAAAGC aTAATTTTGT 840

ACTATAGATG AGGATGATAG AGCATATTTA AGAGGGTGAA ATGTTAAAGT GAAACCGTTT 900 

ACGTTTCCGA TTGCCCAAAC AAATTACATC ATTGTATAAT ATGATTTGTT AAATGCATAA 960 

CAAGAATGAA AATGTAACAT ACGTAGCAAT TGGTTTCATA AATTGGATGT TAGTGGCGTA 1020 
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TGACGAGAGT CGTATTAGCA GCAGCATACA GGACACCTAT TGGCGTTTTT GGAGGTGCGT 1140

TTAAAGACGT GCCAGCCTAT GATTTAGGTG CGACTTTAAT AGAACATATT ATTAAAGAGA 1200 

CGGGTTTGAA TCCAAGTGAG ATTGATGAAG TTATCATCGG TAACGTACTA CAAGCAGGAC 1260 

AAGGACAAAA TCCAGCACGA ATTGCTGCTA TGAAAGGTGG CTTGCCAGAm ACAGTACCTG 1320 

CATTTACGGT GaATAAAGTA TGTGGTTCTG GGTTAAAGTC GATTCAATTA GCATATCAAT 1380 

CTATTGTGAC TGGTGAAAAT GACATCGTGC TAGCTGGCX^G TATGGAGAAT ATGTCTCAAT 1440 

CACCAATGCT TGTCAACAAC AGTCOCTTTG GTTTTAAAAT GGGACATCAA TCAATGGTTG 1500 

ATAGCATG6T ATATGATGGT TTAACAGATG TATTTAATCA ATATCATATG GGTATTACTG 1560 

CTGAAAATTT AGTAGAGCAA TATGGTATTT CAAGAGAAGA ACAAGATACA TTTGCTGTAA 1620 

ACTCACAACA AAAAGCAGTA CGTGCACAGC AAAATGGTGA ATTTGATAGT GAAATAGTTC 1680 

CAGTATCXSAT TCCTCAACGT AAAGGTGAAC CAATCGTAGT CACTAAGGAT GAAGGTGTAC 1740

GTGAAAATGT ATCAGTCX3AA AAATTAAGTC GATTAAGACC AGCTTTCAAA AAAGACGGTA 1800 

CAGTTACAGC AGGTAATGCA TCAGGAATCA ATGATGGTGC TGCGATGATG TTAGTCATGT 1860 

CAGAAGACAA AGCTAAAGAA TTAAATATCG AACCATTGGC AGTGCTTGAT GGCTTTGGAA 1920 

GTCATGGTGT AGATCCTTCT ATTATGGGTA TTGCACCAGT TGGCGCTGTA GAAAAGGCTT 1980 

TGAAACGTAG TAAAAAAGAA TTAAGCGATA TTGATGTATT TGAATTAAAT GAAGCATTTG 2040

CAGCACAATC ATTAGCTGTT GATCgTGAAT TAAAATTACC TCCTGAAAAG GTGAATGTTA 2100 

AAGGTGGCGC TATTGCATTA GGACATCCTA TTGGTGCATC TGGTGCTAGA GTATTAGTGA 2160 

CATTATTGCA TCAACTGAAT GATGAAGTTG AAACTGGTTT AACATCATTG TGTATTGGTG 2220 

GCGGTCnAAC TATCGCTGCA GTTGTATCAA AGTATAAATA ATAAGAAAAC AGGTTATCAC 2280 

AACA(jrATTA ATtACATGTT GGCATAACCT GTTTTTATTT GTTTATGGAT TTATTGGGTA 2340 

ATATTAGTCA TTTGATGGTT TAATTGCAAA TGCTCTAACA GGGAACCCAG GTGCATCTTT 2400 

TGGTTTAGGG CTGATAGCGT AAATGATGGC GCCACGAGTT GGTAATTGAT CTAAATTAGT 2460 

TAATAACTCG ACTTGGTATT TATCCTGACC AAGAATATAA CGTTCGCCAA CTAAATCACC 2520 

ATTTTTTACA ACGTCCACAG ATGCATCGGT ATCGAATGTT TCATGACCAA CAGCTTCAAC 2580

ACGACGTTCT TCAATTAAGT ACTTCAAAGC ATCTAATCCC CAACCCGGTG CATGTTGTTG 2640 

TCCGTTCGCA TCTTTGTTTT CAAACTTTTC AATATTAGGC CAACGTTTTG ACCAATCGGT 2700 

ACX3AAGTGCA ACAAAAGTGC CAGGTTCAAT AGTACCATGC TCTTTTTCCC ATGCTTCTAT 2760 

ATGCGCACGT GTTACGATGA AATCATTGTT GTTCGCTACT TCTGTTGAAA AGTCTAATAC 2820 
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AAAGTGAATT GGTGCATCAA TGTGAGTACC ATATTGCGTT ACAATATTCC AACGTTGCAC 2940 

ATAGAAACCA TGATCTTTAA CCGTGAATAA AGTTGAAACT TCGCCTTTTT CAAACTCACT 3000 

AAAACGTGGT ATTTCCGGAT CAAATGTATG CGTTAAATCA ACCCAAGTTG CTTGTTTTAA 3060 

AGTATTTAAT TGTTGCCATA AAGGATATTG TGTCATAAAA TCACCCGTTT TTAGTTTATT 3120 

ATATGATAAA TGCTGCGATT ATTCTTGGCG TTTAGCTTTA ACAGCATTCA CAAGCACAGT 3180 

CAATGCATCT TTAACTTCTT CTTCTTTTCG CGTTTTTAAA CCACAGTCAG GGTTTACCCA 3240 

GAATAATGAG CGGTCGATTT GTTGTAGTGA ACGATTGATT GCTGTAGTAA TTTCTTCTTT 3300 

TGTTGGAATA CGTGGACTAT GAATATCATA TACACCTAGA CCAATACCTA AATCATAATT 3360 

AATATCTTCA AAGTCTTTAA TTAAATCACC ATGGCTACGA GATGTTTCAA TTGAAATAAC 3420 

ATCAGCATCT AAGTCATGAA TAGCATGAAT GATTTGACCG AATTGAGAAT AACACATATG 3480 

TGTATGGATT TGAGTTTCAT CACX5AACTGA AGACGTTGCA AGTTTAAATG ATAAAACAGC 3540

ATCTTTAAGA TATTGTTCGT GATATTCAGA GCGTAATGGT AAGCCTTCAC GTAATGCAGG 3600 

TTCGTCAACT TGGATAACTT TGATTCCTGC AGCTTCAAGT GCTAATACTT CTTCGTTGAT 3660 

TGCTAAAGCA ATTTGATCTT GAACXaVCTTT ACGTGGTAAA TCAACACGTT CAAATGACCA 3720

GTTTAGAATT GTTACAGGTC CAGTTAACAT ACCTTTAACT GGTTTATCTG TTAAGCTTTG 3780 

TGCATAAACT GTTTCATCAA CAGTTAAAGG CGCTGTCCAT TTTACATCAC CATAAATGAT 3840 

TGGTGGTTTT ACGGCACGTG AACCATATGA TTGCACCCAA CCGAATTTAG TTACTAAGAA 3900 

ACCTTGTAAT TTTTCTCCGA AGAATTCAAC CATGTCATTA CGTTCAAATT CACCGTGAAC 3960 

TAATACATCT AAGCCAATGT CTTCTTGAAT TTTAATCCAT CGAGCAATTT CATTTTTTAA 4020 

GAATGTTTCA TATGCTTCGT CTGTAATGCG TTTGTTCTTC CAATCTGCAC GGTATTTTCG 4080 

AACTTCTCGG CTtTGTGGGA ATGATCCAAT AGTTGTTGTT GGTAAATCCG GTAAGTTCAA 4140 

ACGTTTTTGT TGTTGTTCAA TACGTTGCX3C GAATGGTGAT TGTCTTGAAG TACGCACGCT 4200 

TTCGAAATCA TAATCTAAGT TTTTGAATGA TTGATTTTGG AAACGCTCAT AACGTGCTTT 4260 

TAATTTATCA TATTTAACAC TATCGTTTTG ATTAAATAGG Oy^CGCAATG CATCTAATTC 4320 

GTCTAATTTT TCAGTTGCAA AGCTTAAGCC TTCGCCAACA CTTGTATCTA ATGTTTCATC 4380

ATCTAAAGAT ACTGGAACAT GTAATAATGA AGATGATGGT TGAATGACAA GTTCATTAGT 4440 

6TGTGCTAAC AATTTATOGA TTAAGACTTT TTTAGCTTCA ATGTCACTTG CCCATACATT 4500 

ACGACCATCA ATAATTCCAG CGTATAATGT TTTTGATTTA TCAAAATCTC CAGCTTCAAT 4560

TTGTTTAAGG TTATAGCCAT TATCATGGAC AAAGTCTAAA CCTATACCAC CAACAGGTAA 4620 
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AACACCAGCT TTTTCGAAAT AGTCATAAGC TTCACGTGTA ATATTTTCAT AGCTTTCGCT 4740 

GTCGTCTGTA ACTAAGATTG GCTCATCAAC TTGAATGTAC TCAGCACCTG CATCAATTAA 4800

TGATTCAAAC ACTTCTTTAT AAAGTGGTAA TAACGTTTTA ACTTTTTCTT CAAAAGTTTG 4860

GTGACCGCCT TTTGATAATT TAACAAAAGT AATCGGACCA ACAATGACAG GGTGAGCGTT 4920 

AACGTTTAAA GATTGGGCAT ATTTAAAGCG ATCTAATAAT ACATTGCGAC TCACTTTAGG 4980 

CTCAACATTG TCCCATTCAG GTACGATGTA ATGATAGTTA GTGTTAAACC ATTTTATAAG 5040 

TGCACTTGCA ACATGGTCTT TATTACCGCG AGCAATATCA AATAATAAAT CATCATCAAT 5100 

AGTTCTTCCT TGGAAACGTT CAGGGATGAT GTTGAATAAT AATGACGTAT CTAATATATG 5160 

GTCATATAAA GAGAAATCAC CAACTGGGAT GCTATCTAAG TGATAGTACT TTTGtAATAA 5220 

TAAATTTyCT TTATGTAGAT CAGTTAATGT TTGATCTAAT TCTTCTTTAG AAATCTTCTT 5280 

TGCCCAATAA CTTTCGATGG CTTTTTTCCA TTCTCTTTTT CTACCTAATC TTGGGAATCC 5340

TAAGTTTGAT GTTTTAATTG TTGTCATAAT ATTGCCTCCT TGTGAGCAGT AATAGATTTT 5400 

GAGTATGCTG CAAGTTCTAA TGAATCTTCG ACATTTTGAA ACGGTGTGAT AATGTATAAA 5460 

CCATTAAAAT ATTCATGAAC AGTATCGATT AAATCCTTTG AAAGCTTAAG ACTTAGTTCT 5520

CX3TGTTTTGG CTTTATCATC TTTAACTGCT TCAAATTGTT GTAAAATTTC ATCTGACATC 5580 

TTGATTCCTG GCACTTCATT ATGCAAAAAG AGTGCGTTTT TGTAACTTGC GATAGGCATA 5640

ATGCCTATGA AAAATGGTTT GTTCAAGTGC TTAGTGGCAT GGTAAATTTC AATGATTTTC 5700 

TCTTTGCTGT ACACGGGTTG TGTTATAAAA TAAGACATTC CGCTTTCTAT CTTTTTCTCT 5760 

AATCTTTTGA CGGCACCATA TAATTTACGA ACATTAGGGT TAAAGGCGCC AgcGATGTTG 5820 

AAGTGTGTAC GTTTCTTCAG CX3CATCACCG TCAGTGTTAA TACCTTGATT AAATCTTAGA 5880 

GCGfiGTTCAG TTAATCCTTT AGAATTAACA TCATAGACAT TGGTTGCACC TGGTAAGTGA 5940 

CCAACTTTTG AAGGATCACC AGTTATGGCT AATATTTCGT TAACGCCAAT GAGCGATAAT 6000 

CCAAGTAAAT GGGACTGCAA GCCGATTAAG TTTCGGTCTC GACATGTAAT ATGTACGAGT 6060 

GGTTCAATAT TGTAATATTG CTTAATTAAG CTAGCAGCAG CAATATTGCT AATTCTGACA 6120

GTTGCCAATG AATTATCTGC GAGTGTTACC GCATCTACAT TAGCTTTATC AAGTTTAGCG 6180

ATATTTTCAA AAAATCTATC CGTGTCTAAA TGTTTCGGTG TATCCAATTC GATAATAACX5 6240 

6TTGGACGTT CTTGAACCTT AGATGTTAAT GATTGTCTAA CTTTATTTTG AGATGGATTG 6300 

AAAAGTGCTT TCGTTGGTAT CGGAATCACT TTTTTGTCAT TAACAGGTTT AAGTGTCTGA 6360

ATAGATTCTT TAATAAATTT GATGTGCTCT GOCGTTGTAC CACAGCAACC ACCAATTAAA 6420 
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TACTTAAATT CACTATTTTC AATATCTAAT AAGCTGGCAT TTGGATAACA AGATAAGAAT 6540 

GCGTGCTCTG GTAATTCAAT ATGTGTGAAA GACTCTTGCA TATGGTGCGG GCCATGATGA 6600 

CAATTGAGTC CCACGATGTT TGCACCACAT TGAACGAGTT GTTTTAATCC TTCATTGATT 6660 

GCCTGACCAT TAACTAAGTA ATTTGTGTTT GAAGCGGTTA ATTGAGCAAT GATTGGAATG 6720 

TCGTATTTCT TTCTCGTTCG TGAAATGACA TTTGTTAACT CTTCTAGGTC GTAATACGTT 6780 

TCGAAAAGTA GCGCGTCAAC GCCTTCTTCA ATTAAGGTGT CTATTTGAAT TTCAGTATGA 6840 

TAAAGAATAG TTTGTAAGCT GATATCCTCT TGTTTGATAC CTCTAAACCC ACCAACTGTG 6900 

CCTAATATAT ACGTATCTTT ATTTGCTGCT TTTTTTGCGA TGCGAACGGC GGCTTGATGT 6960 

ATTGCTTTAA CTTTATCTTC AAGACCGAAT CGTTTTAACT TTTCAAAATT TGCACCATAA 7020 

GTATTGGTTT GAATGACATC AGCACCGGCT TCAATATAT6 AACGATGGAT GCGTTCAACT 7080 

TTATCTGGAT GGCTAAGATT ATATGCTTCT GGACAGGTGT CTAATCCTTC AGAGTATAAA 7140

ATGGTTCCTA TAGCX5CCATC AGCTACTAAA ACATTATCTT TCAATTGTGT GAGGAATTGA 7200 

CTCATTGAAT GCCTCCTTTA ATGCGTATTT GATGTCTGCA ATGAGTTCAT CAGGATCTTC 7260 

GAGACCAACA CTTAATCGGA ATAQACCGAA AGTGATACCA CGTTCTTOTC TCACTTCTTC 7320

AGGTAGTGCA GCGTGAGACA TTGTTGCTGG ATGTQAAAGG ATCX5TTTCAA CACCGCCCAG 7380 

ACTCACTGAA ACGAGTGGTA ATGTCAGTGC ATCGACAAAT TGTTGTGCTT TAGACTCATC 7440 

AGCTAAACGA AAGCCAATAA CGGCACCGCC ATTTTTAGCT TGTTCTAAAT GAGCAGTAGT 7500 

GAGTCCCGGA TAATAAACTT CTGAAATTTC ATCTTGCTTT ATTAAAAATG ACACGATTTT 7560 

TTGAGCGTTT TCGACAGATT GTTTAAATCT GATTGGAAAA GTTTTTAAAT GTTTAGCAAG 7620 

TGTCCAGCTA TCCTGAGCAG ATAACATATT GCCTGTACCA TTTTGTATTA AATAAAGAGC 7680 

GTCftCTAATT GCCTCATTAT TAGTTATGAC AGCACCAGCA ATTAAATCGC TATGTCCACT 7740 

TAAAAATTTT GTAGCACTAT GAATGACAAT ATCAGCGCCA AGTAATAAAG GTGATTGACc 7800 

TAACGGTGTC ATAAATGTAT TGTCCACAGC TACCAGTAGT TCATGCTTTT CGGCTATTTT 7860 

AGAAACAGCT TTGATATCAG TAATTTTAAA ACAGGGATTC GATGGTGTTT CGATATAAAT 7920 

TAATTTTGTG TTTGATTGAA TGGCACCCTC GATTTGTTCG AGCTTTGTAG TATCTACGGT 7980

TGTAAATTCA ATATTAAATC GATTCAAAAT TTGCTCAGTG AGGCGAAAAG TACCGCCATA 8040 

TACATCATCG GGTAAGAT6A CATGATCACC AGATTTGAAA GTCAAAAGTA CTGCTGAAAT 8100 

AGCAGCAATA CCTGATGCAA AAGCAAAAGC GAATTTTCCC TGTTCTAATC GTGCTAACTT 8160

CTCTTCTAAA AGTTCACGGT TAGGGTTGCC CTTCGTGCAT AATCATATTT AACATCGCCA 8220 
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TCCACACCTC TACGCCAATC GAATATCACT TCTGTCTCTT TTGAAAGTGT CATACAATCT 8340

CTCCAATCTG AGCTTTATCT AATGCTTGGA TGATATCGCG TTCGATGTCT TCATAATTTT 8400 

CAACACCTAG TGATAAGCGG ATTAAATACT CATCAATGCC ACX5TTTATCT TTTTCAGCAT 8460 

CTGGCATATC AACATGTGTT TGGGTGTAAG GGAAGGTCAC TAATGTTTCA GTACCTCCTA 8520 

AACTTTCTGC AAAAATGCAA ATGTCTAAAT TTTCTAATAA TTTAGCGACG CTATAGGCCT 8580 

TGTTAAGTCT TAAACTAAGC ATGCCAGTTT GCCCGCTATA TAGTACTTCG TCAATTGCTT 8640 

GAAGTGACTG ACATTTTTTA GCAAGnTTC TAGCGTTTGA TTGCGCACXSC TCAATGCGTA 8700 

AATGCAAAGT TTTAAGTCCA CGTAACAACA AATAACTATC TATTGGTGAA AGTGTTGCGC 8760

CAGTCATGTT GTGAAAATCA AACAACTGTT GCGCX3AGTGA TTCATCTTTG ACGGTTACX5A 8820 

CACCTGCTAG TACATCGTTA TGTCCGCCAA TATATTTCGT GGCTGAATGT AAGACTATAT 8880 

CAGCACCTTC TGCTAGTGGT GTTGAAAGAT AAGGTGTTAA AAAAGTATTG TCGATAATTG 8940

ACAATAAGCC TITAGCTTTA CAAAGTTGAT AGTATGGCTT TACATCAATA GCAATCATTT 9000 

GTGGGTTAGA TATTGGTTCA ATGAATAATG CAACTGTTTT ATCAGTGATT TCTTTTTCAA 9060 

CTTGTTCATA ATCTGTAAAA TCAACGTACT TAAATTTGAT ATCGTATTGT TGCTCGTAAA 9120

ATTCAAATAA TCTAAATGTG CCACCATATA AATCGAATGA AACTAAAATT TCATCATGAG 9180 

GTTTAAATAG ATTACATATT AATTGAATGG CTGACATTCC ACTTGATGTA GCGAATGATG 9240

CAATACCATG CTCAAGTTTG GCAAAACAGG TTTCAAATGT TGAGCGTGTA GGATTTTTAG 9300 

TACGTGTATA ATCAAAACCT GTCGATTGTC CTAGTTTTGG ATGCTTGTAG GCAGTAGATA 9360 

AATGGATTGG ATTCGCTATA GCACCGGTTG AATCATCGGT TAATGTGATT TGGGCTAACT 9420 

GTGTATCCTT CATATTAAGA CCCTCCTATA AGAAAAAATA AAAAAAGCTT CC6TCCTTCG 9480 

TACCCGAATG AATCGGATAA AAAGGACGAA AGCTTATGTT TCGCGGTACC ACCTTTATTT 9540 

GTTATTCCAT CGCTGAAATA ACCTTATTCA GTACOCATTA AAAGTAAATA TGCTTACTGA 9600

ACAATTATCA CAATTAAAGT CAGTAAGTAA GGATATAGTA ATGTGCTATC CCATACTTAT 9660 

TAACAAAAAA TCGTGCGTAA AGAATCCAGT ACGCCATTTA ACATCAATGT TAATACTGTA 9720 

TCGCTATAAC GGGCGAACCC GTAGACACCT CATATTGGCA TCAACACTCC AAGGCCATTT 9780

TCAAACACGC TTTCAAAATC TTCTCTCAGC TACTAAAGAC TCTCTGTATA AGCAGGGTGT 9840 

GTTTTACTTy CCTCTTTATT GTGTTTACGT TTCATTAAAC TGTTATAAGA TATTAATTAG 9900 

CTTACAGAGT AAAAAAAGAT TTGTCAACAA TTATTCAGAA AATTTTGATT TAAAAGTTAA 9960 

TTTGTTTGTG AAATTGTAAT TGGTATCTTG AAGTTGAAAA ATGAATTATT TTTTAAATAA 10020 
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TCAAATAAAA AGTGATGTGA GTGAATTGTC AAAAAGTGAA GATCAACGTA TTACTAAAAC 10140 

AAAAGATGAA CAAATTAAGC AAATAGATAT ATCGGATATC AAACCGAATC CGTATCAGCC 10200 

CCGAAAAACT TTCGATGAAA ATCATTTAAA TGATTTGGCA GATTCAATTA AGCAATATGG 10260 

AATTTTGCAA CCAATTGTGC TTAGAAAAAC AGTTCAAGGT TATTACATTG TAGTTGGTGA 10320 

AAGAAGGTTT AGAGCTTCGA AAATTGCTGG TCTAAAATAC GTATCAGCGA TTATCAAAGA 10380 

TTTAACAGAT GAAGATATGA TGGAACTGGC GGTCATCGAA AATTTACAAC GAGAAGACTT 10440 

AAATGCGATT GAAGAAGCTG AAAGTTATCA ACGTTTGATG ACAGATTTGA AAATTACACA 10500 

ACAAGAAGTA GCGAAACGAT TGAGTAAGTC GCGCCCGTAT ATAGCGAATA TGTTQAGGTT 10560 

ATTACATTTG CCGAAAAAGA TTOCTGACAT GGTAAAAGAT GGGCGACTGA CAAGTGCACA 10620 

TGGACGAACG TTATTGGCAA TTAAAGATGA ACAACAAATG CTTAGGTTAG CGAAACGGGT 10680 

TGTTAAAGAA AAGTGGAGT6 TCAGATATTT AGAAAACCAT GTTAATGAAT TAAAAAATGT 10740

TTCX3TCAAAG TCGGAAACAG ACAAAGTAGA TATAACTAAG CCTAAATTTA TAAAGCAGCA 10800 

AGAACGACAG TTGCGAGAAC AGTATGGTAC CAAAGTAGAT ATATCAATAA AAAAATCGGT 10860 

TGGTAAAATC TCATTTGAGT TTGATTCACA AGAAGATTTT GTGAGAATAA TTGAACAATT 10920

AAATCGTAGG TATGGTAAAT AGTTACACAA TTTTATATAA TAACTCTTTG TGCT^GTGTA 10980 

AATAAATTGT AATCAGTGAC ATTTGATTCT AGAT 11014 
(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6022 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

TCCCCTTATG GAATTTCACA TTCTAGTTTA CATAATATAT ATTATAGGAA GTTATATGTG 60 

TGTAACGCAA AAgGTACCCT ACATCATAAT CATTATCTAA TATCGTCACA TAACTTACTT 120 

ATGCTATAAT CATGGTATTA TATTGTTTGG AGTGATTTGA TGAGATTTGT CTTTGATATT 180 

GATGGTACGC TTTGTTTTGA CGGCCGATTA ATTGACCAGA CTATTATTGA TACATTGTTA 240 

CAATTACAAC ATGATGGTCA TGAACTTATA TTTGCATCAG CACGTCCGAT TCGTGATTTC 300 

TTGCCAGTTT TACCATCAGT ATTTCATCAG CACACATTAA TTGGCGCAAA TGGTGCTATG 360 

ATTTCACAGC AATCAAAGAT TTCTGTTATC AAACCAATTC ATACTGATAC ATATCATCAT 420 
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GCTGCACAAC TTGACGCTGn AGAACGCX5AT TTTTGAGCGT TTAGATCCAC ATAAGCTGGC 540 

CAGTTGTATT GATGTTGCAA ATATCGACAC GCCAATCAAG AkTATTTTAT TAAATATAGA 600 

CCCGGCACAA ATTACAACTA TATTAGACXSA GCTAGATAAA TACCATCAAG AATTGGAAAT 660 

GATTCACCAT TCAAATGAGT ATAACATTGA TATAACAGCG CAAAATATTA ACAAATATAC 720 

TGCATTACAA TATATATTTG ATGCAGATGT TAAATATATA GCATTTGGTA ATGACCACAA 780 

TGATATTGTC ATGTTACAAC ATGCTAGTAG TGGCTATATT ATAGGACCAT CAGAAGCATA 840 

CACACACGCA ATATTGAAAC TTGATAAAAT CAAACACATC AATAATAATG CACAAGCTAT 900 

TTGCAAAGTC TTAAAATCAT ATAAATAAAA ACACCCCTAT CAAATGATAA TCATTATCAA 960 

TCGATAGGGG CTATTTTAAT AAAATTOGTC CTCGAACATT TCTTCCTCTT CATCTAATCC 1020 

AAATAATTCT GCCATTTCTC CATGTTCAAT TAACATGTTT AAATATGCAT CGCX3GAGTTC 1080 

TTCTTCACTC ATATCATTAA TCATTTCTTT AAGACTATCA ATCCACATAT TTCTGCGTAA 1140 

TTGATAGTCT TCTTCAACTT CGTTTAACAT CATTATATGT TTATTTGCTG CTTCTGGACT 1200 

AGCTGTAAAG AGTAATGCAA TCATATGTTT ACATATCACT CGTCTTCCAT CAGCATGAGG 1260 

ACAATTACAT ATGGATTTTC TAGGATGTTC CATATCAATA TAACAACGAT ATACTTTGTT 1320

GCCACTGCCC TTTACTTCAG CCTCATGCTG CGTTTCTGAA AATGATTTTA AGTTAATGAC 1380 

GCATTCACTT TGATAATAAT TAAAGCCTCT TTCTATAGAA CGAATACTTG CAATATCAAG 1440 

TAATCCCATT AATGaTACTC CTTTTTATTA TTATTTTTAA ATAAAGAaAA TAAAATAGAT 1500

AAGTGTCTAG ATTAAAATAC TTGATTTATC TATATTTTAT AACAAGTCTA GAATTATCGC 1560 

ATTCTTAAAT AACTAATATG AAAATGcTTG CACTAATTCt TTTGTATAAG GGTGTCTATC 1620 

AACATTAAAT AATTCCtCTA TTGCAAAATC ATCGACTATC ATGCCATCCT TAAGAACGAT 1680

AATTCTATTA ACTAAGCGTT GTAACACGGA TAAATCATGA GAAATAACGA TAAAATGATT 1740 

TAAGTTCGTA ATCGTTTGCG CTTTTAATAT ATTGATTACA TTTTGTTCAG CTATAACATC 1800 

TAAATTTGAA GTTATCTCAT CACATATTAA AACGCGAGGC TGTGCTAATA ACGAACGCAT 1860 

GACATTAAAT CTTTGTAATT GTCCGCCACT CACTTCGCTT GGTAATTTAG TCAATAATTG 1920 

CGCGTTTAAC TCAAAAGTAG ATAAATGTTG TAATAATAAT TGATCCTGAG CAGTATTATC 1980 

AGTTAGACCT CTGTAATAAT ATAACGCTTC TTTTAATGAG GTCTCAATCG TCCAATCAGG 2040 

GTTAAAGCTA GTTAAAGGGT GTTGGAAAAT CGGTAACACA GCATTGTCAC TTAAGTAAAT 2100 

CTCTCCTTTA ACAGGTTTAA ACAAGCCAAG AACCAATGAA GCGAGCGTAC TTTTACCACA 2160 

GCCACTTTCG CCTAAAATAC CAACATTTTC TCCATCAGGT ATAGTAATAT TGATATCTTG 2220 
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CCCTCTTTAA TTGTGTTCTA TATTTAATTA GACGTTCAGT ATACGGATGC AAATGCTCAT 2340 

ACTTGAAATG ATTAATATTA CCTCGTTCAA TGATTTGACC TTCTTTTAAA ACATAAATGT 2400 

ACTGACAATA TTTCAATACA TGACTTAAGT TATGTGTGAT AATAAATAAT GTTTGACCAT 2460

GrrCTAATAC AATATGCTGT AATAAATCCA TCACTTGATT ACCOTTCAAA GCATCCAATG 2520 

ATGCAACTGG TTCGTCTGCA ATGATTAATT TAGGCTCCAA CATGAGAACG CTTGCTAT6T 2580 

ATACGCGTTC AAGTTGGCCC CCAGAAAGTT GGAAACTATA TTTATTTAAT ATATCTTTGC 2640 

TTTGTAAATT AACCCACGAC AAAGCCTTAT CAACTTTGGA CAAAGCCTCT TCTTTACTAC 2700 

CTTTATAATG CTTACGATAA ATCGCAGTTA ACTGTTTACC TAATTTAGTA TGGTCGTTAA 2760 

AACTTTCTGC ATAATTTTGA GAAATATAGC CAATTGTATG ACCATAATAT TGACTCAATC 2820

TACTAACATT TTCCCCATCA AATTGGTACG AATCATACGT GCAGCTTAAA TCAAATGGTA 2880 

AATATTCAAG TAAAGCTTTA GCAATCAAAC TTTTTCCAGC GCCGCTCTCT CCAATCAAGG 2940 

CATTAATCTG TTGACTAAAA ATTTTCAAAT CAATCCCTTT AATAAGAGAT TTCTCACTAG 3000 

TATTCTTTAT TGTTAAATTT TGTATATCAA TGAGACTCAT CATATTCACC CCGTTGTTTC 3060 

AGCAATCTAT CTCTTAGTGC ATCACCGGTT AAATTAAAAA TTAAAATAGT TATAGCAATG 3120 

ACTGAAGCAG GTGCAATCAA CATAATTGGA TGAGACX3AAA TAAAATCACG ACCTTGTTGC 3180 

AACATAGCGC CCCaCTCTGG TGTTGGCGGT TGTGCACCTA ACCCAATAAA TGATAGTGAA 3240 

CTTATATATA GAATGATTTT ACC3GAAATCA ACGACCATCA AAACGATAAT AGCCGGTATA 3300 

ATTTTAGGTG TTAAATGACG TATTAATATT GTTCTTGTTG GTACATGAAA TAATTGTGCC 3360 

ATTTTTATAT AAGGCTTATT CATTTCGCTA TTAACTATAC TTCTAGTCAA CCTTGTGTAA 3420 

TTCATCCATT TTATTAATGT AATTGAGATA ACTAAATTCC ATAAAGATGG TTGAAAAAAA 3480

CTTGSTAAAG CAATCATGAT GATAAATTCT GGAATACTTA GACCAACATC AATAAACCTT 3540 

AACACTAATC GTTCAATCCA CCCTTTTTTG TATCCGGCAA ATAGACCTAG TGTAACACCT 3600 

ATGACAACGA TAGCTATTAA TGTTAAAACA GTAACAAACA AT6TTGAACG TGCACCGATA 3660

ATAATTCGGG TAAATAAATC TCTCCCATAA TCATCAGTTC CTAATAAATG CAACCAACTA 3720 

ATAGGTTCAA AAGTTTGTGA TAAATTGACT TTGGTTGCAT TTTCACTACT GACAAAGAAT 3780 

TGCAGTACAA TTACCACAAA AATAAATGCA ACGAATACAA AAAATATCAG GTTATTCTTT 3840

GAAAATATTT TATGCATGAC GGTCACTACT TTCTGATATC AATGGTGTAT TGGTTTTGAT 3900 

TTTTGGATTT CCTAATTGTA AACGCTGCTT CGGATCAAGT AATAACGTTA ATAAATCAGC 3960 

AATCGTATTG ATAATAACAA CGAAGAAGCC AATAAATAAC ACGCATCCTT GAATAACAGG 4020 
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ATTTTCAATC ACTACAGTAC CACCTATTAG ACTGCCAAGT GAAATCCCTA GTAATGGGAT 4140 

AATCGGCAAA ATTGTTGGTT TTAGTAAATC ATGAATTAAA ATATAACXTTT CATTCATACC 4200 

GCGTAATCTT GATGCTTGTA CGATATTACT TTGCAATAAC ATCAATAAAT TAGAACGCAC 4260 

TAAACGAATG ATGTATGCAC ACATACCTAA AGATAGCGTG ATTACAGGTA ATATAAACTG 4320 

ACTTAGTATA ACGCTATCTA TATTCATTAA ATTTGTGACA ATAAATAATA AAATAATACC 4380

GATAAAGAAC GCTGGTAAAC TAATCGATAG TGTTGAGATC ACTCTAATCA CTTTATCCGT 4440

CCACTTATGA AATCGTTTGG CTGCTATAAT GCCX3A6CGGT ATAGATATGC ATAACGACAC 4500

TACTAATGTT GAAAATGATA TGAGTAATGT TATGGGTGCA TAGTTGAATA ATATCTGTGT 4560

TACCGGTTCT TTTGATTCAA AACTTTTTCC TAAATTAAAA TGTAATAAAT GATTCATCCA 4620

ATGCCACCAC TGTACCAATA AAGAATCATT TAATCCCAAT TTATCTTTGG TTGCATTTAT 4680

TTGTTCCGTC GACACTTGTG CTACATCAAG ATGTAATATT TTATCAACAG GATTGCCTGG 4740

TGATAATTTC ATTAAAATGA ATGTAAGTGT AGAAATAACA AATAAAACAA CTATCATTTG 4800

CATCAGTCTA TACAACATAG ACTTTATTAT GAACATAATA GTCCCCCTCC TT6TGTAAGT 4860

TACTAACACT TTCTTTTTAC ATGAGAATGG CGCATGTATA TGCAACTTAC ATATTAAGAA 4920

CTAACGTTCA TTATAGTATT ATCCATAAAG AAATTGAAGT ATATTTAATT TTTTAACAAA 4980 

ATCATTATAA AATATAATAT TTTGAATCAA GTCAACCATG TAAAATATAA AAAAGTCAAA 5040 

ACAAAAACAA CTATAGCACT GTATTCCATC TCTTTCX3AAA TAATTGTTAC TGCAGTGTAA 5100

CTTAAAAGTC GATGATTTTG TGCATATAGT TGTCGAATAT TATTTTTTAT CTTTACGGCG 5160 

AAGTTCAGCG CCCTCATAGC CGTATTTTTC AATTTGCTTT TCTAATTTAC GOGCTTTTCT 5220 

TTCTTTACGC CAATTTCTAG TAAAATACCA TAATAGAAAA CTAATTAATA AACTCATAAT 5280

CGCTAAAAAT GCAGCGTATC CTAATAATGG TTGATATTTT ATATCTTGAA AATTTGGAAT 5340

AAAAAATGCA AGCACACCTA ATATAACAAA TGTAATTACT GCAGATACAA ACCATTTATT 5400 

TAAAACTAAG CAACAGAATA TTGTTAATAA AATCATTATT AATGTTGTGA TCCATAAATA 5460 

ATTAGGCATA TCGAATAATG TCATATTCAT TCTCCTTTTA TTTCATTACT TTCCTTGTAT 5520 

ACATTTTATT ATAAATTTTT AAAAACTTAA ACAATAGCAG TCAGTTTCAA GCAATATTCT 5580 

ATCTACTAAT AGAAAAATCA TTGTTCCTTG CGACATGGAA ATCGTAACAT TATCGTTTAG 5640 

GAGACAAAAT TATGTATAAT GAATGTATTA TACCAAAGGA GTGATTATAT GTCTCAAGGT 5700 

TTACCTTTAA GAGAAGATGT TCCTGTTTCA GAAACATGGG ATTTAGTAGA CTTATTTAAA 5760 

GATGATCAAC AATATTATGA AAGTATTGAC GCTCTAGTAC AnCAAGCAAA TCAATTTCAT 5820 
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GAAAATATTT TAATTGCCTT AGATCGCTTA AGTAATTATG CAGAACTACG TTTAAGTGTA 5940 
GATACTAGTA ATATCGAGGC ACAAOTATTG AGCGCTAAAT TATCTACTAC ATACGGTAAA 6000 
ATTGTTAAGC CAATTATCCT TT 6022 
(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 476 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

CCATCAATAA TGTATACATG ATTGGCATCA TATTCCCCTT TAATTAGAGA GCTACGTACA 60 

GTTTGTyTTA TTAAAGTAGA ACTAATAAAT AACCATCTCT TATGTGCACA AACACTTCCC 120 

GCAACAATTG ATTCAGTTTT ACCAACCCGT GGCATACCTC TAATGCCAAT CAACTTATGA 180 

CCTTCTTCTT TGAACAATTC AGCTAAAAAG TCTACTAACA AGCCTAAATC TTCACGCTCA 240 

AATCGAAAGG TTTTCTTATC TTTTGCATCT TGCTCAATAT ATCTTCCATG TCTTACTGCA 300 

AGACGGTCTC TTAATTCTGG TTTTTTAAGC TTTGTTATTT CAATTTCATT TATACCACGA 360 

GCTATTTGCT CAAAACGTTC AACTTTTTCA AGATTGTCTG TTTTAATTAA AAGGCCTCGT 420 

TTACCTTGAT CAACACCATT AATTGTAACA ATACTTATAC CTAACATACC TAATAA 476

(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3633 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 

AGAAATACAA CGAAGCATAT AAATATAACC GATCTTTTTT CTAATTGAAT ATTAAGTAAG 60 

TGTATGTACT TTCTGGAAGT AGCACCTAGT rGGATTGTtC CTCCTACAAC AGGCCAAAAA 120 

TTTTTATTTT TAACTGGCTT AACAGTGTTC AGTTTTTCAT ACTCTTCTCT ACTAATTTTG 180 

GCGCACCITT TTGGAATGAA CCAATTAATA AATGGAAAAA AGTATACAAG CCAAGTTCTT 240 

ATTACATCGA CCATTAAATA CTCATCATCA TACTTAATAA CTCTGTATTT CGGATTTTTA 300 

TTGATAATTT OGGTTTCACA AAGCAATAAT TATCACTTCC TATTAATAAC AAATTCACAC 360 
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TTATATGACC TTAAATATAT AACATGAATC TTTTTGTCTA TTATTGAAGA CATATTTATA 480

AAGAAAAATA GCATTGTCAT AATAACCCAA GCAATAAATA CTATAATATT TTGGATAGAT 540

AAACTAATCA rrACATCTAA GAACATGATT gATAATCCAC CACAGAAAAA ATAAGAAAAT 600

AGTACAAAGC AAAGATTCTT GAATGATGGA AAAATCATAA TTTTTCCATT GCTACTCCGA 660

TCATTATAGA TAGATAACTT TACTTTCTGA TTTAAATATA TATAAAACAC TAGAATACTT 720

AATAATAAAA CCGAACAAAT GATAATAACG CAATTTTTTT CTAAATGAGA ATCAGGTATA 780

TATATTTTAT CTCTAAACAT AGTGCCAAAT AAAA5TATGC TACCTATAGC TGGCCATAAA 840

GCTTTaTTTT TAACTGGTTT GACAATATTT AAATTATCAA AATCTTCTCT GCTGATTTGG 900

ACATATTTTT TTGGTATTAA CCAATTAATA AACGGAAAGA ACAAAACTAA CCAGGTGCTT 960

ACTAAATCAA TCATCAGATA GTCGTTTTTA TATTTAATAA TTCTATATCT GGGATTTTTG 1020

TTTACAACTC TAACCTCGCA AAGCAATATC TCCACTTCCG TCTCGTTGGT TTTATATCTA 1080

ATACACTTTC AGATACTTTA TAAGTGTTTT GTATTTTAGT AACATACTAT TTTCCTGTTT 1140

ATTACTTAAC TTACGAACTA CAATCTAAGT TTAGTAATTT CTATTGCTTT TTAAGTTTGG 1200

CATAAACCTT TTTATTACTA ATTGAGCCCA TGCTTATTAG AAAGAAAAAA ATTGTAATAA 1260

TAATCCACAT AATAAATACC AGTAGATTTT GAGGTTTTAT AGTCATTAGC CATATTAAAA 1320

ATAATATAGA ACAACCTCCT AATAATAGAT ATGTGAAAAC TATAAAACTT CCATCTTTAA 1380

AAGTAGGCAC TAATATAACC CTATTTTCAT TATCTAGATT ATCATCATAT ATCTTTAGTT 1440

TAAGCTTTTT ATTTAAGTAA ATGTAAAATG CTGCAATACC TATAAATCCT ATAAAACATA 1500

AAGATATTAA AATCTTATTA TCTAATTGAA CTTCAAACGT ATGTACATAT TTCCGTAAAA 1560

TAACTACAAA TAAAAACGAA CTACCAGTAA CTGGCCAGAA AATATTATTT TTATTTTGTT 1620

TATCWVCATT TAAATTTTCA AGTTCCTTCT CACTAAGTTT TGCATACCTT TTGGGAATGA 1680

ACCAATTAAT AAAAGGAAAA AAGTATACAA GCCAAGTGCT TACTAAATCA ATTAACAAAT 1740

ACTCATCATT ATATTGAACG ACTTTATATC TCGQATTTTT ATTAATAACC TTAATATTAA 1800

AAAGCAAAAC TCACCACGCC CATTTCATTG GATTTATATG ATTGCTAATA ATATTTTTAG 1860

CTTCACTAAC AGCATTCCCA ACACTATCCA TGGATTTTTC TGTAGTTTTT TTAACAACAT 1920

CTATACTATT ATCGATTTTA TGCCCTACCC AGTCTACTTT ATCTTTTAAT CCAAAAATAT 1980

TATTTTGATA AATTAAATCT GTTCCTAATG CAAATACTGT ACTCATAGCC AAACCTGCTA 2040

AAATCACCCA TCCTACTGGA TTACTTCCTA AAACAAAAGT CGCTAATCCA GCTCCAACTG 2100

CTGTCCCTGC AGATCCAGCT GCAAGCGTgC ATACCATTAT GCGACAACGC CTCTCCAAAT 2160



EP 0 786 519 A2 

CCTTTACCTA GGTATTTTCC GCCTTTTGCA AATTTACTAC CATTTTCTAT AAACACATTA 2280 

CCTGATGTAC GTTTGACTTC CACAAATGAA TTTGGACCTG CTGGGCCTTT CACTCCACCT 2340 

GCTGTATTGa TAAATACACC GAATTTACTT GcATTTATAC CGTCTTGCTC TAAAAGTGTT 2400 

GACGTAATAT CTAATCCTAT ATCTCTTTTA ATACTGTCTT TATTGTCATT TATATATTTC 2460 

AATATACTTT TCGGGATATC GTCTTCTGGA TGTTCTTTGG CATATGCCTT TATAACAGCA 2520 

AAGTCTGCTT TATTTAAAGT TTCTTTCTCT GCTTTATGTT CAATTTTCCC CATAGCAACT 2580 

TTCAAATATT TTTCATGACT TGCTTTGGCC CAATCAAGTT CTTTACCTGA AGGAATATTA 2640 

AATTGATTTG TTGAAAAGTT CCAAAAATTC TGCGCTTGGG TAAGTCCTTG TTGGACAATT 2700 

TTTTGAAATT CTTCAACTTC TTTAAATATT TCTGGTGATT TTTGATTAAA CTCACGCAAT 2760 

TTGCGTAGCT TCTCTTCTAA TTCATGTTTT TGTTGACCTA ATGTTCGTAT TATTTGTTGG 2820 

TTCGATGAAA TGGCTTGCTG ATTATCGGAA GCATGCTTTT TCAAATTGTT ATTCAAATTT 2880

TCATATCGCG TAATTTGTTG ACTTAATGAT CTGATATCTT CTTCAAGCTC TGATTCTTTT 2940 

AAAGATATGC TATCAACCTC ACTCGTATAA CGTGACACAA AATTaTCGCA AGCTTGCTTC 3000 

GTTAAATCAC TCAATGTTTT CATACTTGTT GATAATGGAA TTAACACCGT ACTAAAAAAT 3060 

TGCTTAGCTG ACGTATACGC TTTCCCTTTA AGCGCATCAT CATTAATAAA TTGAGTAATT 3120 

GCTTTTTCCA ACGCATCATA ATTTGAATTC ATTGTTTGAC TCAAATTCCC CACACTTGAA 3180 

GCTTGGTTTC GAGATCTGTC TAAATACATG TCAATACTCA TCGGCATGCT CCTTTTTCAA 3240 

AAATATATGA TTTTCAAACT ATTTAAAATC AAATGCTTTT TACATCTACA AAGTTGTAAA 3300 

ATTTTAAAAC TCGGCGATGA TTATTTCTTA TGTAAAGGAG TCTAGATGCA GGTAAATTGA 3360

6ATAACATGT CGCCTTTTTT CTTATTTTAG CATATGGATA TAATGGTGTC TTTGTATATT 3420 

CGCAATTAAT CAATAAAAAT TATCTTTCAA TATTTTAATT TTATTGCGAC AACATCCTTA 3480 

ACATTAAATA TATTAATATC TCAAAATATA TTCACTATTA AAATATGTCA TCAGTTGTTA 3540 

AAAGTATTTC CTCATCATGC GAAATATCAA AACGTATCTA AAATACGAAT AAGTTTATAC 3600 

AATCACACAA CATCATCATT CAAAATTTTA TTG 3633 

(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2365 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 
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GATATACATT GAATGTGTTA TATGATCGTT ATCAGTTACC ACTTTTTATT GTGGAAAATG 1860

GTTTTGGTGC AGTTGATGAA GTGGTAGATG GACATATTCa TGATGATTAT CGCATTGAAT 1920 

ATTTAAAAGC ACATATTACA GCAGCGATAG AAGCAGTTGA TCAAGATGGT GTAGATTTAA 1980 

TCGGTTATAC ACCGTGGGGA ATCATTGATA TTGTTTCATT TACAACCGGT GAAATGAAGA 2040 

AAC6CTATGG TTTAATATAT GTTGATCGAG ATAATGATGG TCATGGCACG ATGGAACGCT 2100 

TGAAAAAAGA TTCGTTCTAT TGGTATCAAC AAGTGATAGC ATCAAATGGA GATAAATTAT 2160 

AAAGGTATAT TATAAGTATT TTAGGGTTAG AGCCCGAGAC ATAAATTAAT ATAGTAGGAC 2220 

CTACAGTGTT ATAATGGCGG gCCCCCAACA CAAAGAATTT CGAAAAGAAA TTCtAcAGGT 2280

aATGCaAGtT GGCGGGGcCC AACACAGAGA AATTCGAAAA GAAATTCTAc AGGTAATGCA 2340 

AGTTGGGGAA GGACAGAAAT AAATT 2365 
(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11050 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 

CTGCGATACG ATTTGTTGAA AGTGGGGAAA ACAAAAAAGT TATCATTACC AATTTAGAGC 60 

AGGCATACGA AGCTTTGATT GGTAATAAAG GTACACACAT TCACATGTAG CACTTTATCA 120 

CGCGACAAAA CATTAAATAT GTTTCTCCGT TGATTCAAAT GAAAAAGTTG TCTGCTGACA 180

CTTTGCAAGG TTTGAAGGAG TTTAACTTAT GACAGAAAAC TTTATTTTGG GTAGAAATAA 240 

TAAATTAGAA CATGAACTAA AGGCATTAGC AGATTACATT AATATACCAT ATAGTATATT 300 

ACAACCATAT CAAAGTGAAT GTTTTGTCAG ACATTATACG AAAGGCCAAG TTATTTATTT 360 

TTCGCCACAA GAAAGTAGCA ATATTTACTT TTTAATTGAA GGTAACATTA TTAGAGAACA 420

TTACAATCAA AATGGAGATG TATATCGTTA TTTTAATAAA GAGCAAGTAT TATTTCCAAT 480 

CAGTAACTTA TTTCATCCGA AAGAGGTTAA CGAATTGTGT ACAGCATTAA CCGATTGTAC 540 

AGTTCTTGGA TTGCCTAGAG AATTGATGGC CTTTTTGTGC AAAGCTAATG ATGATATATT 600 

TTTGACACTT TTTGCATTAA TAAATQATAA TGAGCAGCAA CACATGAACT ATAACATGGC 660 

ATTAACAAGT AAATTTGCTA AAGATCGAAT TATCAAATTG ATATGCCATC TATGTCAGAC 720 

AGTAGGATAC GATCAAGATG AATTTTATGA AATCAAACAG TTTTTAACTA TTCAACtCAT 780 
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TGAAAAACTT GTTGTTAAAG ATCATAAAAA TTGGTTAGTA AGCAAACATT TATTCAATGA 900 
TOTATGTGTT TAATATACAA TGTAAAATGA ATAAGTTGAA CATCAGGTCT AACGTACATT 960 

TATACGTTAG GCCTTTTTTG CTAGCATGAT GAATAATTTA AAATGTTAGT TAAATTTGAT 1020 

TGTTGAAATT ACAGTAAAAT TTAAGGTGAT GAAAAATTTA GAACTTCTAA GTTTTTGAAA 1080 

AGTAAAAAAT TTGTAATAGT GTAAAAATAG TATATTGATT TTTGCTAGTT AACAGAaAAT 1140 

TTTAAGTTAT ATAAATAGGA AGAAAACAAA TTTTACGTAA TTTTTTTCGA AAAGCAATTG 1200 

ATATAATTCT TATTTCATTA TACAATTTAG ACTAATCTAG AAATTGAAAT GGAGTAATAT 1260 

TTTTGAAAAA AAGAATTGAT TATTTGTC6A ATAAGCAGAA TAAGTATTCG ATTAGACGTT 1320 

TTACAGTAGG TACCACATCA GTAATAGTAG GGGCAACTAT ACTATTTGGG ATAGGCAATC 1380 

ATCAAGCACA AGCTTCAGAA CAATCGAACG ATACAACXiCA ATCTTCGAAA AATAATGCAA 1440 

GTGCAGATTC CGAAAAAAAC AATATGATAG AAACACCTCA ATTAAATACA ACGGCTAATG 1500

ATACATCTGA TATTAGTGCA AACACAAACA GTGCGAATGT AGATAGCACA ACAAAACCAA 1560 

TGTCTACACA AACGAGCAAT ACCACTACAA CAGAGCCAGC TTCAACAAAT GAAACACCTC 1620 

AACCGACGGC AATTAAAAAT CAAGCAACTG CTGCAAAAAT GCAAGATCAA ACTGTTCCTC 1680

AAGAAGCAAA TTCTCAAGTA GATAATAAAA CAACGAATGA TGCTAATAGC ATAGCAACAA 1740 

ACAGTGAGCT TAAAAATTCT CAAACATTAG ATTTACCACA ATCATCACCA CAAACGATTT 1800 

CCAATGCGCA AGGAACTAGT AAACCAAGTG TTAGAACGAG AGCTGTACGT AGTTTAGCTG 1860 

TTGCTGAACC GGTAGTAAAT GCTGCTGATG CTAAAGGTAC AAATGTAAAT GATAAAGTTA 1920 

CGGCAAGTAA TTTCAAGTTA GAAAAGACTA CATTTGACCC TAATCAAAGT GGTAACACAT 1980 

TTATGGCGGC AAATTTTACA GTGACAGATA AAGTGAAATC AGGGGATTAT TTTACAGCGA 2040 

aGTTACCAGA TAGTTTAACT GGTAATGGAG ACGTGGATTA TTCTAATTCA AATAATACGA 2100 

TGCCAATTGC AGACATTAAA AGTACGAATG GCGATGTTGT AGCTAAAGCA ACATATGATA 2160

TCTTGACTAA GACGTATACA TTTGTCTTTA CAGATTATGT AAATAATAAA GAAAATATTA 2220 

ACGGACAATT TTCATTACCT TTATTTACAG ACCGAGCAAA GGCACCTAAA TCAGGAACAT 2280 

ATGATGCGAA TATTAATATT GCGGATGAAA TGTTTAATAA TAAAATTACT TATAACTATA 2340 

GTTCGCCAAT TGCAGGAATT GATAAACCAA ATGGCGCGAA CATTTCTTCT CAAATTATTG 2400

GTGTAGATAC AGCTTCAGGT CAAAACACAT ACAAGCAAAC AGTATTTGTT AACCCTAAGC 2460 

AACGAGTTTT AGGTAATACG TGGGTGTATA TTAAAGGCTA CCAAGATAAA ATCGAAGAAA 2520 

GTAGCGGTAA AGTAAGTGCT ACAGATACAA AACTGAGAAT TTTTGAAGTG AATGATACAT 2580 
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ACCAATTTAA AAATAGAATC TATTATGAGC ATCCAAATGT AGCTAGTATT AAATTTGGTG 2700 

ATATTACTAA AACATATGTA GTATTAGTAG AAGGGCATTA CGACAATACA GGTAAGAACT 2760 

TAAAAACTCA GGTTATTCAA GAAAATGTTG ATCCTGTAAC AAATAGAGAC TACAGTATTT 2820 

TCGGTTGGAA TAATGAGAAT GTTGTACGTT ATGGTG6TGG AAGTGCTGAT GGTGATTCAG 2880 

CAGTAAATCC GAAAGACCCA ACTCCAGGGC CGCCGGTTGA CCCAGAACCA AGTCCAGACC 2940 

CAGAACCAGA ACCAACGCCA GATCCAGAAC CAAGTCCAGA CCCAGAACCG GAACCAAGCC 3000 

CAGACCCGGA TCCGGATTCG GATTCAGACA GTGACTCAGG CTCAGACAGC GACTCAGGTT 3060 

CAGATAGCGA CTCAGAATCA GATAGCGATT CGGATTCAGA CAGTGATTCA GATTCAGACA 3120 

GCGACTCAGA ATCAGATAGC GACTCAGAAT CAGATAGTGA GTCAGATTCA GACAGTGACT 3180 

CGGACTCAGA CAGTGATTCA GACTCAGATA GCGATTCAGA CTCAGATAGC GATTCAGACT 3240 

CAGACAGCGA TTCAGATTCA GACAGCGACT CAGATTCAQA CAGCGACTCA GACTCAGATA 3300

GCGACTCAGA CTCAGACAGC GACTCAGATT CAGATAGCGA TTCAGACTCA GACAGCGACT 3360 

CAGACTCAGA CAGCGACTCA GACTCAGATA GCGACTCAGA TTCAGATAGC GATTCAGACT 3420 

CAGAPAGCGA CTCAGATTCA GATAGCGATT CGGACTCAGA CAGCGATTCA GATTCAGACA 3480

GCGACTCAGA CTCGGATAGC GATTCAGATT CAGATAGCGA TTCGGATTCA GACAGTGATT 3540

CAGATTCAGA CAGCGACTCA GACTCGGATA GCGACTCAGA CTCAGACAGC GATTCAGACT 3600

CAGATAGCGA CTCAGACTCG GATAGCGACT CGGATTCAGA TAGCGACTCA GACTCAGATA 3660 

GTGACTCCGA TTCAAGAGTT ACACCACCAA ATAATGAACA GAAAGCACCA TCAAATCCTA 3720 

AAGGTGAAGT AAACCATTCT AATAAGGTAT CAAAACAACA CAAAACTGAT GCTTTACCAG 3780 

AAACAGGAGA TAAGAGCGAA AACACAAATG CAACTTTATT TGGTGCAATG ATGGCATTAT 3840 
TAGGATCATT ACTATTGTTT AGAAAACGCA AGCAAGATCA TAAAGAAAAA GCGTAAATAC 3900

TTTTTTAGGC CQAATACATT TGTATTCGGT TTTTTTGTTG AAAATGATTT TAAAGTGAAT 3960 

TGATTAAGCG TAAAATGTTG ATAAAGTAGA ATTAGAAAGG GGTCATGACG TATGGCTTAT 4020 

ATTTCATTAA ACTATCATTC ACCAACAATT GGTATGCATC AAAATTTGAC AGTCATTTTA 4080 

CCGGAAGATC AAAGCTTCTT TAATAGCGAT ACAACTGTTA AACCATTAAA AACTTTAATG 4140

TTGTTACATG GATTATCAAG TGATGAAACG ACATATATGA GATATACAAG CATAGAAAGG 4200 

TATGCGAATG AACACAAATT AGCTGTGATT ATGCCCAATG TGGATCATAG CGCATATGCT 4260 

AACATGGCAT ATGGTCATAG CTATTATGAT TATATTTTGG AAGTGTATGA TTATGTTCAT 4320 

CAAATATTTC CACTTTCCAA AAAGCGTGAT GACAATTTTA TAGCAGOTCA CTCTATGGGA 4380 
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TTATCTGCTG TGTTTGAAGC GCAAAATTTA ATGGATCTAG AGTGGAATGA TTTTTCAAAA 4500

GAGGCCATAA TTGGCAATCT TTCAAGTGTT AAAGGAACTG AACATGATCC GTATTACTTG 4560 

CTAGACAAAG CTGTAGCTGA AGATAAACAA ATTCCAAAAT TGCTCATTAT GTGTGGTAAA 4620 

CAAGACTTTT TATATCAAGA CAACTTAGAT TTTATCGATT ATTTATCACG CATAAATGTT 4680 

CCTTATCAAT TTGAAGATGG ACCAGGAGAT CATGATTATG CATATTGGGA TCAAGCGATT 4740 

AAGCGTGCTA TAACATGGAT GGTGAATGAT TAATTATTTC TTGGAAAATA TGTGGCTGCA 4800 

TTAAATACAC AGAGTGAGAG ATACAAACTA TTTACOCACG ACTAACATTT CTAAGTGTTT 4860 

AAATTATTTT TGTATTAATA TGATTGGCGC AATTTGCTGA TACACAAAAA TGTTTCTCGT 4920 

GAAACTTAGA TTTAGCTTAT AGTTTTATCA TCATTTGTAT GACTTACATT ATAAATTTTA 4980 

TTATAATGAG GTTAACX5CTT TGAAAGGAGT CATCATCATG TCGACCAATA AAAACGATTA 5040 

TGAGCATATG TTGTTTTATT TTGCATATAA AACCTTTATT ACTACCGCTG ATCAAATTAT 5100 

AGAGAAGTAT GGTAT6AGTC GTCAGCATCA TCGTTTTTTG TTTTTTATCA ATAAATTACC 5160 

TGGTATTACT ATTAAATCAT TACTAGAAAT ATTAGAAATT TCTAAmCAAG GATCACATGC 5220 

AACACTTCAA AAATTAAAAG AGCAAGGTCT CATTATTGAA AAAGTTTTAG AGACTGATCG 5280 

ACGTGTCAAA AAATTATATT CGACGGATAA AGGCGATCAA CTCATTGCTG AATTGAACAA 5340 

GGCGCAAGAT GAATTATTGC AAAATATATA TCAACAAGTC GGTTCGGATT GGTATGATGT 5400 

GATGGAAGCA TTGGCTAAAG GgCGACCTGG cTTTGATTTT ATTAAGCATT TGAAAGATGA 5460 

AAAAGAAAGC TAGCATCAGA AATGTTAAAA ATCTTCGCAT TCTTAAATTT AAAAAATATG 5520 

TCAAAAAGTG TATAATAAAA ACATATAATT TAATTGAACT CAGTTTCAAC ACATCTTAGA 5580 

AAGGAGTTTG AATGATGAAA AAATTAGCAG TTATTTTAAC ATTAGTTGGC GGTTTATACT 5640 

TCGCATTTAA AAAATACCAA GAACGTGTTA ACCAAGCACC TAACATTGAG TACTAAATTA 5700 

AACCATAAAA AATTCCCGAA CACCTTGTTA TAGTGCTCXSG GAATTTTTTT ATGCTTTACT 5760 

TGAATATATC AAATATTATT TTTGOGCTTT CTGTATTTTC GATATTACCA CTAAATGATT 5820 

CTGATCTAGG TCCGTAAGCG TAgGTATTAA CATCCTCGCC TGTATGTCCA TCGGAAGTCC 5880 

ACCCTGTATA AGATTTATCA TTTACTGGCT TCTGAATAGC GTGTTGTAGG GCTTTTGTTT 5940 

GCGTTTCTAC TTCTGCGGAT TTTTCGTCTT TTTCTTTTTT AAGTAGTCTT TTTAGCTTTT 6000 

TATTCTCTTT TTTAACCTTT TTCATATCAT CTTGTGAAAA TTCAAATCCA TAACCTTCAT 6060 

TAATAACTTT TTCAGGGTCT TCACCTTTAG CCATTTTTTC TGTCATATAT GATCCAGAGT 6120 

GTTTCATAGA TTTAATCGGT TGAGGATTCC ATTCGTATCC TTTATCTTTA CCAATTGTTA 6180 

ACAAAACATA CAGCTATCTT TGACTGAATT ACAAATATTA AAGTTATTAT TTCAAAATGA 8100 

AGaTAAATAT GTAAGTAGrA CTGCTTTAAT TGaAAAATGT TGGGaATCAG AAAACtTCAT 8160 

AGATGATAAC ACATTAGCTG TTAACATGAC GCGCCT6CTG AAAAAATTAA ATACTATTGG 8220 

CGTTAATGAT TTTATCATTA CAAAGAAAAA TGTCGGATAT AAAGTATAGG GTGAATGCAA 8280 

TGACCTTTCT TAAAAGTATT ACTCAGGAAA TAGCAATAGT CATAGTTATT TTTGCTTTGT 8340 

TTGGCTTAAT GTTTTACCTG TATCATTTGC CATTAGAAGC ATATTTACTA GCACTTGGCG 8400 

TTATTTTATT ATTATTACTC ATATTCATAG OTATTAAATA TTTAAGTTTT GTAAAAACTA 8460 

TAAGCCAACA ACAACAAATT GAAAACTTAG AAAATGCGTT GTATCAGCTT AAAAATGAAC 8520 

AAATTGAATA TAAAAATGAT GTAGAGAGCT ACTTTTTAAC ATGGGTACAT CAAATGAAAA 8580 

CACCCATTAC TGCAGCACAA CTGTTACTTG AAAGAGATGA GCCTAATGTT GTTAATCGTG 8640 

TTCGTCAAGA GGTTATTCAA ATTGaTAACT ATACAAGTTT AGCACTTAGT TATTTAAAGT 8700 

TATTAAATGA AACTTCTGaT ATTTCTGTCA CTAAAATTTC GATTAATAAT ATCATTCGCC 8760 

CAATTATTAT GAAATATTCA ATACAGTTTA TTGATCAAAA AACAAAAATC CATTATGAAC 8820 

CTTGTCATCA CGAAGTATTA ACTGACGTTA GATGGACCTC TTTAATGATA GAACAATTAA 8880 

TAAATAATGC ACTTAAGTAT GCGAGAGGTA AAGATATATG GATTGAATTT GATGAGCAAT 8940 

CCAATCAATT ACACGTAAAA GATAATGGTA TCGGTATTAG TGAAGCGrAC TTGCCTAAAA 9000 

TATTTGATAA GGGCTATTCA GGTTATAATG GCCAGCGCCA AAGTAACTCA AGTGGGaTTG 9060 

GTTTATTTAT CGTAAAACAA ATTTCAACAC ACACAAACCA TCCTGTTTCA GTCGTATCTA 9120 

AACAAAATGA GGGTACAACA TTTACGATTC AATTTCCAGA TGAATAAAAA CTTTCAATAT 9180 

TGTAAGTATA CTAGTAACAT TTTTTTACTA ATTTAAATGT TATTAGTATT mTi ' G ' iTiT 9240 

AATi^TAGAAC TAACAAAGAA ATGAGGTGCA TGCCATGTTG CTAGAAGTGn AACATGTAAA 9300 

AAAGGTTTAT GGTAAAGGTT TGAATGCTAC GACAGCACTT AATCAAATGA ATTTATCAGT 9360 

TGGAGCTGGT GaATTTGTTG CaATTATGGG TGA6TCTGGG tCAGGGAAGT CTACACTACT 9420 

AAATTTAATT GCtTCTTTTG ATGGACTAAC TGAAGGTGAC ATTATTGTGG ATGGCGCACA 9480 

TTTAAATAAT ATGAAAAATA AAAGTAAAGC ATTGTATCGT CaACAAATGG TAGGTTTTGT 9540 

TTTTCAAGAT TTTAATCTTT TACCAACAAT GACGAATAAA GAAAATATAA TGATGCCATT 9600 

AATTTTAGCT GGTGCTAAAC GAAAAGATAT AGAACAAAGG GTACATCAGT TGGCAGTACA 9660 

ATTACATTTA GAGGGATTCT TAAACAAGTA TCCTTCTGAA ATCTCTGGGG GTCAGAAGCA 9720 

ACGCATTGCC ATTGCACGTG CATTAGTTAC TAAGCCGACX3 ATTTTACTAG CCGATGAACC 9780 
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TCAATTGGAA C7WSACAATTT TAATGGTAAC TCATTCAAAT ATCGATGCGT CTTATGCAGA 990 0 

GCGAGTCATT TTTATTAAAG ATGGGCGTCT ATATCATGAA ATATATCXTPG GTGAAGAAAG 9960 

TCAATTAGCT TTTCAACAAC 6AATAACAGA TAGCTTAGCA CTTGTGAATG GAGGAAGTGT 10020 

CAATATATGA AGTTAAGATT GTTATGnACA TAGTGCGACG TCAATTTATT ACGCAGCGAC 10080 

TTGTAATCAT TCCATTCATT TTAGCGGTAA GTGTACTATT CATGATTGAA TATACGCTTG 10140 

TGTCAATTGG GTTAAATAGC TACATAAAAC AGAAGAATGA CTTCCTAGTA CCATTTATTA 10200 

TCATAGCTAA TTTTTTTATG GCGCTTTTAA CTTTTATTTT TATTTTCTAT GCAAATCACT 10260 

TTATGATGTC ACAAAGACGA ATU^GTTTA GCATTTTTAT GACATTGGGC ATGACCAAGA 10320 

AAAGTATGCX5 TTTAATTGTA GTGATGGAAA CTATCTTACA ATTTGTGATA ATTTCAGTCG 10380 

TTAGTATTGC CGGCGGATAC TTACTTGGTG CGATATTTTT CTTGTTTATA CAGAAAATAA 10440 

TGGGCAGTGA AGTTGCGACG TTAAGGTATT ATCCATTTGA CTCTGTAGCG ATGTTTATTA 10500 

CTTTGATTAT CATTGCTGTA TTAATGGGCA TGCTACTTAT ATTCAACTTG TTTAGTATTA 10560 

ATTTTCAACG GCCGATAACT TATCAACATC GTTCCGATTC TAGTGTCATA TCACGATGGT 10620 

TGCGTTACGT TTTAATTGTT ATAGGAAGCG CAnACTATAT TTAGGTTACT TTATTGCATT 10680 

ACAACAAGAT ACGACGTTTG GTGCCTTTTT TAAAATATGG ATTGTCATAG GATTAGTTAT 10740 

TATCGGTACT TATGCATTTT TTGTAGGTAT AAGTGA3\ATA ATTATTAGTA TATTGCAGCA 10800 

GGTATCAAAA GTTTACTATC ATCCACGGTA TTTTTTTGTG GTAGTTGGGA TGCGTGTACG 10860 

TCTTAAAATG AATGCAGTCA GTCTTGCAAC AATCACTTTG CTGTGTACAT TTTTGATTGT 10920 

AACGCTCACA ATGACATTAA CAACCTATCG TGATATGAAT CATACCATTA CGAAATTGAT 10980 

TACGAATGAT TakGATTTGT CATTTAGCGA CAATTCTAAG TCACAAaTAG AACGTCAACA 11040 

AACAATTGAG 11050 
(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 983 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 
OGACATAACG AGGCAAGGGT ACATGATACT TTAGCCTCGT TTTT6ATATG TATTTTTCTG 60 
AATATAAGGG CAATAGATGG TATTTTATAw TTTTTTTAAG GTAGTGATTA ACATAGATAT 120 
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GAATGGTTTC TTCGAAGATA TCATACATAC AAAGGTAAAT GTAGAGGATA AACAAATATA 660 

TAGTGATTTA AAAAATGATA TTGATCAATA TGCGCAAAAG TTGTCGTTTA ATCAATTAAT 720 

TTTGATGTTT GATCAACTGA CGGAAGCACA TAAGAAATTG AmTCAAAATG TAAATCCAAC 780 

GCTTGTATTT GAACAAATCG TAATTAAGGG TGTGAGTTAG ATGCCAAATG TAATAGGTGT 840 

TCAGTTTCAA AAAGCGGGAA AATTAGAATA TTATACACCT AATGATATAC AAGTAGATAT 900 

AGAAGACTGG GTAGTTGTCG AATCTAAAAG AGGCATAGAG ATAGGTATTG TTAAAAATCC 960 

ATTAATGGAT ATTGCTGAAG AGGATGTTGT GTTACCTCTT AAAAATATTA TTCGCATTGC 1020 

TGATGACAAA GATATTGATA AATTTAATTG TAATGAACGA GATGCTGAAA ATGCATTAAT 1080 

ACTATGTAAA GACATTGTAA GAGAACAAGG TTTGGACATG CGTTTAGTCA ATTGCGAATA 1140 

TACATTAGAT AAATCGAAAG TTATTTTTAA TTTTACGGCG GATGATCGTA TTGATTTTAG 1200 

AAAATTAGTA AAAATATTAG CGCAACATTT AAAAACACGT ATCGAGTTGA GACAAATTGG 1260 

TGTAAGGGAT GAAGCCAAAT TGCTTGGCGG TATCGGACCT TGTGGTAGGT CGTTATGTTG 1320 

TTCTACATTT TTAGGGGATT TTGAACCAGT ATCGATTAAG ATGGCTAAGG ATCAAAATTT 1380 

ATCATTAAAT CCAACTAAAA TTTCTGGTGC ATGTGGTCGT TTGATGTGTT GTTTAAAATA 1440 

TGAAAATGAC TATTATGAGG AAGTACGTGC ACAATTACCT GATATTGGTG AAGCAATTGA 1500 

AACGCCTGAT GGTAACGGGA AAGTAGTTGC TTTAAATATA TTAGACATTT CTATGCAGGT 1560 

GAAGCTTGAG GGACATGAAC AGCCACTTGA ATATAAATTA GAAGAAATAG AAACTATGCA 1620 

TTAAGGAGGC ATTATTACAT TTGGATCGCA ATGAAATATT TGAAAAAATA ATGCGTTTAG 1680 

AAATGAATGT CAATCAACTT TCAAAGGAAA CTTCAGAATT AAAGGCACTT GCAGTTGAAT 1740 

TAGTAGAAGA AAATGTAGCG CTTCAACTTG AAAATGATAA TTTGAAAAAG GTGTTGGGCA 1800 

ATGATCAACC AACTACTATT GATACTGCGA ATTCAAAACC AGCAAAAGCT GTGAAAAAGC 1860 

CATTACCAAG TAAAGATAAT TTGGCTATAT TGTATGGAGA AGGATTTCAT ATTTGTAAAG 1920 

GCQAATTATT TGGAAAACAT CGACATGGTG AAGATTGTCT GTTCTGTTTA GAAGTTTTAA 1980 

GTGATTAATC AAGCACACTC AAATAGTGTT ATAATTATAA ATGAATATGG TTTGGATAAG 2040 

TCTGAGACAA TGCATGTTTC AGGCTTTAAT TGTGTATAAA GTTTTGGTGA TTGCATAAGA 2100 

GATGGCGGTA CTAAATGTTA TTATTAAGTG TGCACGCAgT ATCaTTAGTT ATAAAATGTA 2160 

GCTGTTAAAA GTCAAAAATA CATCGAATGT AGTTAOOCAT ATAATATAAA AAGAGTTTTC 2220 

AATTACTCAA TAGAAAAAGG TTGTCTTCAT AGGAGTTAAA AATGTTAAAA GAGAATGAAC 2280 

GATTTGATCA ACTAATCAAA GAAGATTTTA GTATTATTCA AAATGATGAT GTTTTTTCAT 2340 

TGGACTTATG TTCAGGCAAT GGGGTGATAC CCTTGTTATT GTTTGCGAAA CATCCACGAC 2460 

ATATAGAAGG TGTTGAGATT CAAAAAACAC TTGTCGATAT GGCGCGACGC ACATTTCAAT 2520 

TCAATGATGT TGATGAATAT TTAACAATGC ATCACATGGA TTTGAAAAAC GTTACTAAAG 2580 

TATTTAAACC TTCACAATAT ACTTTAGTAA CGTGTAATCC GCCTTATTTT AAAGAGAATC 2640 

AGCAACACCA ACATCAAAAA GAAGCACATA AGATAGCGAG ACATGAGATT ATGTGTACAC 2700 

TTGAAGATTG CATGATTGCA GCCCGTCATT TATTAAAAGA AGGTGGCAGG CTAAACATGG 2760 

TACATCX5TGC AGAGAGACTA ATGGATGTCT TGTTTQAAAT GAGAAAAGTG AATATTGAAC 2820 

CTAAGAAAGT CGTTTTTATA TATAGTAAAG TAGGGAAATC AGCACAAACG ATAGTAGTAG 2880 

AAGGTCGAAA AGGTGG/^T CAAGGTTTAG AAATCATGCC CCCATTTTAT ATTTATAATG 2940 

AAGATGGTAA TTATAGCGAA GAAATGAAGG AAGTATATTA TGGATAGTCA TTTTGTATAT 3000 

ATTGTAAAAT GTAGTGATGG AAGTTTATAT ACAGGATACG CTAAAGACGT TAATGCACGT 3060 

GTTGAAAAAC ATAACCGAGG TCAAGGAGCC AAATATACGA AAGTAAGACG TCCGGTGCAT 3120 

TTAGTTTATC AAGAAATGTA TGAGACAAAG TCTGAAGCAT TGAAGCGTGA ATATGAAATT 3180 

AAAACTTATA CCAGACAAAA GAAATTGCGA TTAATTAAGG AGCGATAGTA TGGCTGTATT 3240 

ATATTTAGTG GGCACACCAA TTGGTAATTT AGCAGATATT ACTTATAGAG CAGTTGATGT 3300 

ATTGAAACGT 6TTGATATGA TTGCTTGTGA AGACACTAGA GTAACTAGTA AACTGTGTAA 3360 

TCATTATGAT ATTCCAACTC CATTAAAGTC ATATCACGAA CATAACAAGG ATAAGCAGAC 3420 

TGCTTTTATC ATTGAACAGT TAGAATTAGG TCTTGACGTT GCX5CTCGTAT CTGATGCTGG 3480 

ATTGCCCTTA ATTAGTGATC CTGGATACGA ATTAGTAGTG GCAGCCaGAG AAGCTAATAT 3540 

TAAAGTAGAG ACTGTGCCTG GACCTAATGC TGGGCTGACG GCTTTGATGG CTAGTGGATT 3600 

ACCTTCATAT GTATATACAT TTTTAGGATT TTTGCCACGA AAAGAGAAAG AAAAAAGTGC 3660 

TGTATTAGAG CAACGTATGC ATGAAAATAG CACATTAATT ATATACGAAT CACCGCATCG 3720 

TGTGACAGAT ACATTAAAAA CAATTGCAAA GATAGATGCA ACACGACAAG TATCACTAGG 3780 

GCGTGAATTA ACTAAGAAGT TCGAACAAAT TGTAACTGAT GATGTAACAC AATTACAAGC 3840 

ATTGATTCAG CAAGGCGATG TACCATTGAA AGGCGAATTC GTTATCTTAA TTGAAGGTGC 3900 

TAAAGCGAAC AATGAGATAT CGTGGTTTGA TGATTTATCT ATCAATGAGC ATGTTGATCA 3960 

TTATATTCAA ACTTCACAGA TGAAACCAAA ACAAGCTATT AAAAAAGTTG CTGAAGAACG 4020 

ACAACTTAAA ACGAATGAAG TATATAATAT TTATCATCAA ATAAGTTAAT CACTTTATCG 4080 

ATTaTATGAA ATTTTAAACG ATTTTATAAA CGCAAGCTGT AATTTTAAAT GGTAAGTTAT 4140 
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GTTTTTTAAT GTAAAATAAA TACATTGAAA GTAATAAATA CCTTAACATT GAATAAGATG 4260 

AAAATGAGAT GACGAGATAA ATGTTCGCGT CCGTTGAAAT GCATAGAAAT CTTAGATATT 4320 

ATTTGAAGTG AGACATTACG AGGAGGAACA GTTATGGCTA AAGAAACATT TTATATAACA 4380 

ACCCCAATAT ACTATCCTAG TGGGAATTTA CATATAGGAC ATGCATATTC TACAGTGGCT 4440 

GGAGATGTTA TTGCAAGATA TAAGAGAATG CAAGGATATG ATGTTCGCTA TTTGACTGGA 4500 

ACGGATGAAC ACGGTCAAAA AATTCAAGAA AAAGCTCAAA AAGCTGGTAA GACAGAAATT 4560 

GAATATTTGG ATGAGATGAT TGCTGGAATT AAACAATTGT GGGCTAAGCT TGAAATTTCA 4620 

AATGATGATT TTATCAGAAC AACTGAAGAA CGTCATAAAC ATGTCGTTGA GCAAGTGTTT 4680 

GAACGTTTAT TAAAGCAAGG TGATATCTAT TTAGGTQAAT ATGAAGGTTG GTATTCTCTT 4740 

CaSGATGAAA CATACTATAC AGAGTCACAA TTAGTAGACC CACAATACGA AAACGGTAAA 4800 

ATTATTGGTG GCAAAAGTCC AGATTCTGGA CACGAAGTTG AACTAGTTAA AGAAGAAAGT 4860 

TATTTCTTTA ATATTAGTAA ATATACAGAC CGTTTATTAG AGTTCTATGA CCAAAATCCA 4920 

GATTTTATAC AACCACCATC AAGAAAAAAT GAAATGATTA ACAACTTCAT TAAACCAGGA 4980 

CTTGCTGATT TAGCTGTTTC TCGTACATCA TTTAACTGGG GTGTCCATGT TCCGTCTAAT 5040 

CCAAAACATG TTGTTTATGT TTGGATTGAT GCGTTAGTTA ACTATATTTC A6CATTAGGC 5100 

TATTTATCAG ATGATGAGTC ACTATTTAAC AAATACTGGC CAGCAGATAT TCATTTAATG 5160 

GCTAAGGAAA TTGTGCGATT CCACTCAATT ATTTGGCCTA TTTTATTGAT GGCATTAGAC 5220 

TTACCGTTAC CTAAAAAAGT CTTTGCACAT GGTTGGATTT TGATGAAAGA TGGAAAAATG 5280 

AGTAAATCTA AAGGTAATGT CGTAGACCCT AATATTTTAA TTGATCGCTA TGGTTTAGAT 5340 

GCTACACGTT ATTATCTAAT GCGTGAATTA CCATTTGGTT CAGATGGCGT ATTTACACCT 5400 

GAAGCATTTG TTGAGCGTAC AAATTTCGAT CTAGCAAATG ACTTAGGTAA CTTAGTAAAC 5460 

CGTACGATTT CTATGGTTAA TAAGTACTTT GATGGCGAAT TACCAGCX3TA TCAAGGTCCA 5520 

CTTCATGAAT TAGATGAAGA AATGGAAGCT ATGGCTTTAG AAACAGTGAA AAGCTACACT 5580 

GAAAGCATGG AAAGTTTGCA ATTTTCTGTG GCATTATCTA CGGTATGGAA GTTTATTAGT 5640 

AGAACGAATA AGTATATTGA CGAAACAACG CCTTGGGTAT TAGCTAAGGA CGATAGCCAA 5700 

AAAGATATGT TAGGCAATGT AATGGCTCAC TTAGTTGAAA ATATTCGTTA TGCAGCTGTA 5760 

TTATTACX3TC CATTCTTAAC ACATGCGCCG AAAGAGATTT TTGAACAATT GAACATTAAC 5820 

AATCCTCAAT TTATGGAATT TAGTAGTTTA GAGCAATATG GTGTGCTTAA TGAGTCAATT 5880 

ATGGTTACTG GGCAACCTAA ACCTATTTTC CCAAGATTGG ATAGCGAcGG AnAATTGCAT 5940 
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AACCTCAAAT TGATATTAAA GACTTTGATA AAGTTGAAAT TAAGGCAGCA ACGATTATTG 6060 

ATGCTGAACA TGTTAAGAAG TCAGATAAGC TTTTAAAAAT TCAAGTAGAC TTAGATTCTG 6120 

AACAAAGACA AATTGTATCA GGAATTGCCA AATTCTATAC ACCAGATGAT ATTATTGGTA 6180 

AAAAAGTAGC AGTTGTTACT AACCTGAAAC CAGCTAAATT AATGGGACAA AAATCTGAAG 6240 

GTATGATATT ATCTGCTGAA AAAGATGGTG TATTAACCTT AGTAAGTTTA CCAAGTGCAA 6300 

TTCCAAATGG TGCAGTGATT AAATAACTGT ATTTTTAAAA ATTAGGAGAG ATAATTATGT 6360 

TAATCGATAC ACATGTCCAT TTAAATGATG AGCAATACGA TGATGATTTG AGTGAAGTGA 6420 

TTACACGTGc TAGAGAAGCA GGTGTTGATC GTATGTTTGT AGTTGGTTTT AACAAATCGA 6480 

CAATTGAACG CGCGATGAAA TTAATCGATG AGTATGATTT TTTATATGGC ATTATCGGTT 6540 

GGCATCCAGT TGACGCAATT GATTTTACAG AAGAACACTT GGAATGGATT GAATCTTTAG 6600 

CTCAGCATCC AAAAGT6ATT GGTATTGGTG AAATGGGATT AGATTATCAC TGGGATAAAT 6660 

CTCCTGCAGA TGTTCAAAAG GAAGTTTTTA GAAAGCAAAT TGCTTTAGCT AAGCGTTTGA 6720 

AGTTACCAAT TATCATTCAT AACCGTGAAG CAACTCAAGA CTGTATCGAT ATCTTATTGG 6780 

AGGAGCATGC TGAAGAGGTA GGCGGGATTA TGCATAGCTT TAGTGGTTCT CCAGAAATTG 6840 

CAGATATTGT AACTAATAAG CTGAATTTTT ATATTTCATT AGGTGGACCT GTGACATTTA 6900 

AAAATGCTAA ACAGCCTAAA GAAGTTGCTA AGCATGTGTC AATGGAGCGT TTGCTAGTTG 6960 

AAACCGATGC ACCX?rATCTT TCX3CCACATC CGTATAGAGG GAAGCX3AAAT GAACCGGOGA 7020 

GAGTAACTTT AGTAGCTGAA CAAATTGCTG AATTAAAA6G CTTATCTTAT GAAGAAGTGT 7080 

GCGAACAAAC AACTAAAAAT GCAGAGAAAT TGTTTAATTT AAATTCATAA AGTTAAAAGT 7140 

GAGAAAGATC ACCGCCATAA ATGTAAACGA T6CTATATTC GTTTAATATG CTATGGTTCT 7200 

rrCTCACTTT TTTAAATTAA AATATCGTGC ATGTGGAATA CGTGCGATAG AGATGGTTAG 7260 

AGCTTTGAAA TTAAGAATTG TAGGAAGGCG TTTTAAATGA AAATCAATGA GTTTATAGTT 7320 

GTAGAAGGAC GAGATGATAC TGAGCGTGTT AAACGAGCTG TTGAATGTGA TACGATTGAA 7380 

ACGAATGGTA GTGCCATCAA CGAACAAACT TTAGAAGTAA TTAGAAATGC TCAACAAAGT 7440 

CGAGGCGTTA TTGTATTAAC AGATCCAGAT TTCCCAGGAG ATAAAATTAG AAGTACAATT 7500 

ACTGAACATG TCAAAGGTGT TAAACATGOS TATATTGATA GAGAAAAAGC TAAAAATAAA 7560 

AAAGGGAAAA TTGGTGTTGA ACATGCCGAC TTAATTGATA TTAAAGAAGC GTTAATGCAT 7620 

GTTAGTTCAC CCTTTGATGA AGCTTATGAA TCAATTGATA AATCTGTGCT AATAGAGTTG 7680 

GGGTTAATTG TTGGGAAAGA TGCAAGGC6C CGTAGAGAAA TTTTAAGTAG AAAATTGCGA 7740 
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GCGGATGTAA GGCAAGCTTT AGAAGATGAA TGAGGAAGTG AAAATGTTGG ATAATAAAGA 7860 

TATTGCAACA CCATCAAGAA CGCGAGCGTT GTTAGATAAA TATGGCTTTA ATTTTAAAAA 7920 

AAGTTTAGGA CAGAACTTTT TGATAGATGT GAATATCATT AATAATATCA TTGATGCAAG 7980 

TGATATTGAT GCACAAACTG GG6TGATTGA AATTGGTCCA GGCATGGGGT CATTGACAGA 8040 

ACAATTGGCC AGACATGCTA AAAGAGTATT GGCATTTGAA ATTGATCAAC GTTTAATACC 8100 

TGTATTAAAT GATACACTAT CACCTTATGA TAATGTGACG GTGATTAATG AAGATATTTT 8160 

AAAAGCGAAT ATTAAAGAAG CTGTTGAAAA TCATTTACAA GATTGTGAAA AAATAATGGT 8220 

TGTTGCAAAC CTGCCGTACT ATATTACGAC GCCAATTTTA TTAAATTTGA TGCAACAAGA 8280 

TATACCAATT GATGGCTACG TGGTGATGAT GCAAAAAGAA GTGGGCGAAC GCTTAAATGC 8340 

TGAAGTAGGT TCAAAAGCAT ATGGTTCGTT ATCAATTGTC GTACAATACT ATACAGAGAC 8400 

TAGTAAAGTA TTAACGGTAC CTAAATCTGT ATTTATGCCA CCACCTAATG TTGATTCAAT 8460 

AGTTGTAAAA CTGATGCAGA GAACTGAACC GTTAGTAACA GTAGATAACG AGGAAGCATT 8520 

CTTTAAGTTA GCAAAAGCAG CATTTGCACA AAGAAGAAAG ACAATTAACA ATAACTATCA 8580 

AAATTATTTT AAAGATGGTA AACAACACAA AGAAGTGATT TTACAATGGT TGGAACAAGC 8640 

AGGTATTGAT CCAAGACGTC GCGGTGAAAC GCTATCTATT CAAGATTTTG CTAAATTGTA 8700 

TGAAGAAAAG AAAAAATTCC CTCAATTAGA AAATTAAATG ATTGACAAAG CAAAGCACTA 8760 

TTGTTAAAAT TTAAATTTTG TTTGACGAAA ACGTTGCAAA TATGGTATTA TGTAACTTGT 8820 

AGCGAGGTGG AGCAATATGC CAAAATCAAT TTTGGACATC AAAAATTCTA TTGATTGTCA 8880 

TGTAGGAAAT CGTATTGTAC TGAAaGCCAA TGGAGGCCGT AAGAaAACAA TAAAACGTTC 8940 

TGGAATTTTA AAAGAAACAT ATCCGTCAGT TTTCATTGTT GAGTTAGATC AAGACAAACA 9000 

CAAC?TTGAG AGAGTATCTT ATACATACAC TGATGTGTTA ACTGaAAATG TTCAAGTTTC 9060 

ATTTGAAGAG GATAATCATC ACGT^TCAAT TGCACACTAA ATAAGACATA TAGAGATGTT 9120 

AGACGTTTCT TAGTATAAGA AGTAAATATT ATGATAATTA TTTGAGTGTT GGGcATTATG 9180 

TTCAATACTC TTTTTATTTA CAAAATGTTT AACACTGATG TTTCGCTTAT AGATTTTTCA 9240 

GTAAATGGAT AATTGTATTT ATAAACACAA ATACAAGTAA ATACTAAGTA ATTAGATGGA 9300 

GAAAATTACT TTTTTATTAA AAAAACACTA AAAAACAAAT TAAAATGTCA AATATTAATT 9360 

CTCTTTATGT TAAAATCATC ATATTAAGAT AACGAAAAGA GGGCGGAAAA TGATATATGA 9420 

AACGGCACCA GCCAAAATTA ATTTTACGCT CGATACACTT TTTAAAAGAA ATQATGGCTA 9480 

TCATGAGATT GAAATGATAA TGACAACAGT TGATTTAAAT GAT€X?rTTAA CTTTTCATAA 9540 
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10 



AAATCTCGCA TATCGTGCAG CGCAACTATT TATTGAGCAA TATCAACTAA AGCAAGGTGT 9660 

AACAATTTCT ATCGATAAAG AAATACCTGT TTCTGCTGGC TTAGCTGGAG GTTCGGCTGA 9720 

TGCAGCAGCA ACGTTAAGAG GATTGAATCG ACTTTTTGAT ATAGGGGCGA GTTTGGAAGA 9780 

ATTGGCTCTA CTAGGCAGTA AAATCGGGAC AGATATTCCG TTTTGTATTT ATAATAAAAC 9840 

TGCACTATGT ACTGGAAGAG GAGAGAAAAT CGAGTTTTTA AATAAACCAC CTTCAGCTTG 9900 

GGTGATTCTT GCTAAACCAA ACTTAGGCAT ATCATCACCA GATATATTTA AGTTGATTAA 9960 

TTTAGATAAG CGTTACGACG TACATACGAA AATGTGTTAT GAGGCCTTAG AAAATCGAGA 10020 

TTATCAACAA TTATGTCAAA GTTTGTCTAA TCGATTAGAG CCAATTTCT6 TTTCAAAACA 10080 

CCCACAAATC GATAAATTAA AAAATAATAT GTTGAAAAGT GGTGCAGATG GTGCGTTAAT 10140 

GAGTGGAAGC GGACCTACTG TGTATGGGCT AGCACGAAAA GAAAGCCAAG CAAAAAATAT 10200 

TTATAATGCA GTTAACGGTT GTTGTAATGA AGTGTACTTA GTTAGACTAT TAGGATAGAA 10260 

GGGTTGAAAA GATGAGATAT AAACGAAGCG AGAGAATTGT TTTTATGACG CAATATTTGA 10320 

TG 10322 

(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5614 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

GATTGATTAA ATGTTTTAAT CCACTTCAAT GCCTTCGATA AACTCTACAA TCGCGCTATT 60 

CATATAATTA TTCGATTTCA TTTGTTCAGC ATATGTCTCA TTAAATCCAG ACATAACTTT 120 

TTTAAAWGCG AAAATTGAAA TTGGTATCGT TACTAATAAG GCACTAGCCA TACGCCAATC 180 

AATGAGCATT ATGTATAAAA AGATAGCAGC TGACAAAAGT AAGTTTCCTA TAACTTCAGG 240 

AATCATATGT GCTAAAGGTA ATTCTATTGT TTCAACCTTA TCGACAAATA TATTTTTTAA 300 

TTCACCTATT TTCTTAGATT CCaCTACGCC TAAAGGGAGA CGCATTAATT TTTGAGCTAA 360 

TTTTTTACGA ATTTCAGATA AAATTTCATA TGCCGTAATA TGTGATAGCA TCGTTGACGC 420 

TCCAAAACAA CACACTTGTG AAATATAAGC GATTAAAGCA ATAAAGATAT AAACCATAAT 480 

CGAATTAATC GTATATGTAT TGTTAATCAT CATTAAAATA ATTTTAAATA CTGCCCAATA 540 

AGGAACTAAT CCAGAAAAGA CACTGATGAT AGACAACAAA ATTGATAACA TAATTTTCCA 600 
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10 



ATATGTAACT CCTkTCMTT AATAATCTAA ATTAAGCCGC TTATATTATT TATTTCACTG 720 

GATGATATAC ATAATATAAA TTTGTTATTT GTTAAAAATT AATACTTATT ACAAGTACAT 780 

CATATATTAG TTGATAACGA TTATCAATGT CGCGTGGATT TGTGACACAT TTCTTTTAAA 840 

AATTCACAAG GTTATGGGGC AGAAATGATA AAGAGCCACT AATGATTTAT TATGTAGTGG 900 

TTCTGGGAGT GGGACAGAAA TGATATTTTC ACAAAATTTA TTTCGTCGTC CCACCCCAAC 960 

TTGCATTGTC TCTAGAAATT GGGAATCCAA TTTCTCTTTG TTGG6TCCCT GAATATAGCC 1020 

TTGTAGAGTC TAGTACATTG ATTT6TATCC CAATGTCCCT ATAATTGATT ATTOGCTTTA 1080 

TCTAATGATC CTATGACTCA ACTATTAAAT CATTTTTCGA AATACTTAAT TCTAATATAA 1140 

TTAAATTCAT TTATTGTAAT ATTGCAAAAA TACATTGCAC ACCTT6TTCA TCAATGCTAT 1200 

AATTAATTAC ATAATAAATT GAACATCTAA ATACACCAAA TCCCCTCACT ACTGCCATAG 1260 

TGAGGGGATT TATTTAGGTG TTGGTTATTT GTCACCTTTT TTATTGTTGC GCGTTCGTAA 1320 

CCAATGTGCA AAAAACGCAA CAAGACAGCC GCTTATAGCT GAAGTCATGA TGTTAATTAA 1380 

TAAATTGAAC ATCCGTCATA CACCTCCTCT CTGCGTTAAA GTAACGCCCG AGATGTTAGG 1440 

CGACCATCAT ATTATATCAT TTATTTATTA TATTTCACGC AATATTAAGG CTTAAGTAAA 1500 

GTrTTTTTTA GTGGTTTACG CTACTTTAAT TGCTATCTTT TAAAATCCAT TTAGATAATA 1560 

TAAATGTGAT GGGTATCGTA ATAATTAAAC CAGCAAATGG TGCAATTTCT GCTGGCAAAT 1620 

TTAGCCAGGA TACAAATACA TATAATAAAA CTGTTTGTAA GCTTACGTTG ACAATCTGCXj 1680 

TAATTGGAAA ACTAATGAAT TTTCTCCAAG TAGGTTTTAC CCTGTAAACA AAATAACAAT 1740 

TCAAATAATA TGAAATCACA AAAGCGACTA GAAATCCGGT AATATGACTA ATCATATATT 1800 

CAATGTGTAA TAATTTTAAC AGCAATAAAT AGACAACATA ATAATTTAAC GTATTAATGC 1860 

CGCCAACAAT GATAAATTTT AAAATTTCAG CATGCGTTTG TGTTAGTTTC ATATGTGTAC 1920 

40 TCCTCAACAT CAAAATATAT GCATAACTAC GTTCTCGAAC ATACTCGAAT ATGCGAGCCA 1980 

ATCCGCTTCA CTTCAAATAT GCTTATTTCA ATCTTTATAC CCTTTCACAG CAAATTTAGT 2040 

CTCTTTCCCC TCATCCTTAT ACGCCATTAT AATGTAACTG ATTTATCGCG TGACTCATTA 2100 

GCACTATAGA GATTACTTTA GTTCACTAGT AATTTTATAT ACAATAAGAG CGACAACAGT 2160 

AATGAGAGGA TGTCTACTAT GCAATTACAA AAAATTGTCA TCGCTCCTGA CTCATTTAAG 2220 

GAAAGTATGA CCGCACAGCA AGTTGGCAAT ATTATAAAAC AGGCTTTTAC TAATGTTTAT 2280 

GGGAATACCC TTCATTATGA TATCATTCC6 ATGGCTGATG OTGGTGAAOG TACCACAGAT 2340 

GCTTTAATGC ATGCAACAGG TGCCACTAAG TATACAGTCA TCGTTAATGA CCCTTTAATG 2400 

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GCGGCAGCGT CAGGTTTGGA TTTATTAGAA AAAGAGGAAC GTAATCCTTT ATACACATCA 2520 

TCATATGGTA CCGGTGAACT AATTAAAGAT GCATTAAATC ATGGTGCTAA GACCATTATT 2580 

TTAGGGATTG GTGGCAGTGC AACAAATGAT GGTGGTACAG GTATGCTAAG TGCACTAGGC 2640 

GTAAAGTTTA CTGATGTAAA CGGGGACTTA TTACAAATGA ATGGTGCTAA TCTTGCTCAC 2700 

ATTGCACAAA TCGATATAAC CAATCTAGAT TCX3CGATTAA AAGAGGTGAC CTTTAAAGTG 2760 

GCCTGTGATG TTTCAAATCC TTTATTGGGT GAAAATGGTG CTACCTATAT TTATGGTCCT 2820 

CAAAAAGGCG CTGATGCAAA GATGATACCA AAGTTGGATT TCGCAATGTC GCATTATCAT 2880 

GATAAGATAA AAATGTGCAC AGGAAAGTCC GTTAATCAAA TACCAGGTTC TGGTGCAGCT 2940 

GGCGGTATGG GCGCAGCATT ATTAGCGTTT TGTGAGACAA CTTTAACAAA AGGTATTGAT 3000 

GTCGTCTTTG ACATTACAGA TTTTCATCAA AGAATTAAAG ATGCAGACCT CGTTATTACT 3060 

GGAGAAGGAC GCATGGATTA TCAGACCATC TTTGGTAAAA CACCCGTAGG CGTTGCGTTA 3120 

GCTGCAAAAC AATATCATAT TCCTGTCATC GCGATTTGTG GCAGTCTAGG CGAAAATTAT 3180 

CAACATGTTT ACGATTTCGG TATTGATAGT GCCTATTCTA TAATCTCTTC ACCTAGCACT 3240 

TTAGAAGATG TCCTACAAAA TAGCGAACAA AATTTATTAA ACACTGCAAC TGACATTGCT 3300 

CGTATTCTGA AATTACAATA ATGTCAAAGT AAATCATCAG CTTTATTATT TGCAGTTAAA 3360 

ACTTGAATGA GGTGAAACCC ATGAAAAGAA CTGATAAATA CCGTGATTCA TATCAATACG 3420 

ACAATCAAAA CCAAAATCAT CGTCGTCAAT CTGAAGACGC ATCGTATAGA CAACAATATG 3480 

CTAAAGGCGA TCCTGAAGAA CACCCGGAAC GATACTATAA TGGTAGAGAT TATCGAAGAG 3540 

AACAAATTCT TGAAGAAGAA AACGAGAAAT CCCGCCGTTC AAAAAAATGG TTATATATCA 3600 

TTATTGCCAT TCTCTTAATT ATTGTCGCTA TTTTTGTCAC ACGCX5CCTTA CTTAACAATG 3660 

ATAGCGATAA agttagtaat gaccctaaag TCTCTCAAAA TTATAAAAAA CAAGTTGAAA 3720 

ATCAAGACGG CCAAATTAAC CAGCAAGTAG ATAATGCTAA AGAAAATATT AAAAACAACC 3780 

AAAAAACTGA tgacattatt aaaaatttac AAAATCAAAT CGACAACTTG AAGCAGCAAG 3840 

AACAAAACAA AGCTGATTCT AAGCTAACTC AATTTTATCA AGACCAAATC AACAAATTGA 3900 

CAGAGGCAAA TAATGCACTT AAAAACAATG CAAGCCAAGG TAAAATTGAA AGCATGTTAA 3960 

ATGATATTAA TACAAAATTC GACAGTATTA AATCTAAATT AGAAAGCTTA TTTAAAGATG 4020 

ACAATGGTGG CGCTAATTAA TTATTACACC TGCTTTGATG ATAAACATTA ATTCCCTATA 4080 

CTTTATCTGT ATCACTACGT TATTCGTGAT GATGCATTAA GAGTATAGGG ATTTTTTATA 4140 

TAAACTTGTA TTCTAACTAC ATACAAATAC ACACAAAACG TATATAATTT ATATAATTAT 4200 
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TTATTGCTAA TTACGTTAGG CGTCATGACC GCTTTTGGCC CACTAACTAT AGATATGTAC 4320 

GTACCATCAT TACCTAAAGT GCAAGGTGAT TTTGGTTCTA CTACATCAGA AATTCAATTA 4380 

ACATTATCAT TCACAATGAT TGGTCTTGCA CTAGGCCAAT TTATCTTTGG ACCTTTATCC 4440 

GATGCTTTTG GTCGCAAACG GATTGCTGTA TCCATTTTGA TCATTTTCAT TTTGGTATCA 4500 

GGTTTGTCTA TGTTTGTTGA TCAATTGCCA TTATTCTTAA CTTTACGATT TATTCAAGGT 4560 

TTAACTGGTG GTGGCX3TCAT CGTGATTGCA AAAGCCTCTG CTGGTGATAA ATTTAGTGGC 4620 

AACGCACTCG CTAAATTTTT AGCATCTTTA ATGGTAGTTA ATGGCATCAT CACTATTCTT 4680 

GCACCATTAG CCGGTGGATT AGCTTTATCC GTAGCAACAT GGCGTTCTAT TTTCACAATT 4740 

TTAACTATTG TGGCACTCAT CATTTTAATT QGCGTCGCTT CTCAATTACC TAAAACATCT 4800 

AAAGATGAAT TAAA6CAGGT GAATTTTAGT AGCGTCATTA AAGATTTTGG AAGTCTTTTG 4860 

AAAAAACCAG CATTTATTAT TCCAATGCTA TTACAAGGwT TAACTTATGT AATGCTATTT 4920 

AGTTATTCAT CTGCATCGCC ATTTATTACT CAAAAATTGT ATAATATGAC ACCCCAACAA 4980 

TTTAGTATCA TGTTTGCTGT TAACGGTGTA GGTTTAATCA TTGTCAGTCA AGTCGTTGCT 5040 

TTATTAGTAG AAAAATTACA TCGCCACATA TTATTAATCA TTTTAACTAT TATACAAGTG 5100 

GTAGGTGTTG CTTTAATTAT CCTGACACTT ACATTCCATT TACCACTTTG GGTCTTACTC 5160 

ATCGCATTCT TCTTAAATGT GTGTCCTGTG ACX5TCAATTG GACCGCTTGG TTTCACAATG 5220 

GCTATGGAAG AACGAACAGG TGGCAGTGGT AACGCATCAA GTTTACTTGG CTTATTCCAA 5280 

TTTATCTTAG GTGGCGCTGT TGCACCATTA GTTGGCTTAA AAGGCGAATT TAATACATCA 5340 

CCATATATGA TTATTATCTT CATTACAGCC ATTCTATTAG TCAGTCTACA AATCATTTAC 5400 

TTTAAAATGA TTAAAAAGCA ACATGTCGCA TAACACTTCA ACATAATTAG AACCCTAGCA 5460 

AAGM'ATCTA TCTTTGTCAG GGTTCTTCTT TATGAATTAT GAGATCGAAT CTTCAACTAA 5520 

AATTACGCCT TCATAGCAAG GACATTTCTA TTCAATCACC CTTTAACAGG CATCCAAATT 5580 

TcTGTAATAT ATTTTTCACT TGTAGTATCA CCAT 5614 




(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9179 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 
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AAAGACAATG ATATGAAGTA TATGGATATC ACAGAaAAAG TGCCAATGTC GGAATCTGAA 120 

GTTAACCAAT T6CTAAAAGG TAAGGGGATT TTAGAAAATC GAGGGAAAGT TTTTCTAGAA 180 

GCTCAAGAAA AATATGAGGT TAATGTCATT TATCTTGTTA GCCATGCATT AGTAGAAACA 240 

GGTAACGGCA AATCAGAATT AGCAAAAGGC ATTAAAGATG GGAAAAAACG CTATTACAAC 300 

TTTTTTGGTA TAGGAGCATT CGATAGTAGT GCTGTTCGTA GTGGGAAAAG TTATGCTGAA 360 

AAGGAACAAT GGACATCACC AGATAAGGCG ATTATTGGTG GTGCAAA6TT CATTCGTAAT 420 

GAATATTTTG AAAACAATCA ACTGAATTTA TATCAAATGC GATGGAATCC AGAAAATCCT 480 

GCGCAACATC AATATGCGAG TGACATTCGC TGGGCAGATA AAATTGCCAA ATTAATGGAT 540 

AAATCCTATA AGCAGTTTGG TATAAAGAAA GATGATATTA GACAAACATA TTATAAATAA 600 

GACATCX5GTG CTTAAAGGAG CTGGAACAAT TTATTGTTTC GAGCTCCTTT AGCGCATTCT 660 

GAGTGTGTTA GTTAAATGGA TTTTAACCTA ACAAAAAACG CTATATAGCA TCAAATATGC 720 

TATATCCCAC ATCATTGTTA CAAATGTACA TGATGTAAAT GAATATTGCT GTCTAAATGT 780 

GCATGTAATA TACAATGGTG CAGATAATAC ACTTAAGTCC TTAAAAATGA AACGTTAgTT 840 

CCAAGAGTCA TTTTTAAACA ATAGTGCATG TGATAAAATA GAAAAGAATG AAAAATATAG 900 

AGGTGACAAT ATGAAGATAG CAATTATAGG TGCAGGCATC GGTGGATTAA CAGCTGCTGC 960 

ATTATTACAA GAACAAGGTC ATACTATTAA AGTCTTTGAA AAAAATGAGT CAGTTAAAGA 1020 

AATTGGCGCT GGGATTGGTA TCGGAGATAA TGTGCTTAAA AAACTAGGTA ATCATGACTT 1080 

AGCTAAAGGT ATTAAAAATG CTGGGCAAAT CTTATCTACA ATGACAGTGT TAGATGACAA 1140 

AGATCGCCTG TTAACTACTG TTAAATTAAA AAGTAATACA TTGAATGTGA CGTTACCAOG 1200 

CCAAACATTA ATTGACATTA TTAAATCTTA TGTAAAAGAT GACGCAATAT TTACAAATCA 1260 

TGAAGTCACG CATATAGATA ATGAGACAGA TAAAGTTACC ATACATTTCG CGGAACAAGA 1320 

AAGTGAAGCA TTTGATTTAT GTATTGGTGC TGATGGAATT CATTCTAAAG TGAGACAATC 1380 

TGTAAATGCT GACAGTAAAG TATTATATCA AGGGTATACA TGCTTTAGAG GTTTAATTQA 1440 

TGATATTGAT TTAAAGCATC CGGaTTGTGC AAAAGAATAC TGGGGaAGAA AAGGaAGAGT 1500 

AGGTATTGTT CCGTTATTAA ATAATCAAGC ATATTGGTTC ATTACAATTA ACTCGAAGGA 1560 

AAACAATCAT AAATATAGTT CGTTTGGTAA ACCTCATTTG CAAGCATACT TTAATCACTA 1620 

TCCAAATGAA GTTAGAGAGA TCTTAGACAA ACAAAGTGAA ACAGGTATCT TATTGCATAA 1680 

TATTTATGAT TTGAAACCAC TCAAATCTTT TGTTTATGGT CGTACTATTT TACTAGGAGA 1740 

TGCAGCACAT GCGACAACGC CTAATATGGG GCAAGGTGCT GGACAAGCAA TX5GAAGATCC 1800 

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10 



TAAAATACGT GTCAAACATA CTGCAAAAGT AATTAAGCGT TCTAGAAAAA TCGGTAAAAT 1920 

TGCCCAATAT CGTAGTCGTT TATTTGTTGC AGTTAGAAAT CGTATTATGA AAATGATGCC 1980 

AAATGCATTA GCAGCTGGAC AAACTAAATT CTTATATAAA TCGAAAGAAA AATAATACAA 2040 

CAATATGAAA ACCCCCGTAT GTTGAAACGA GAGCTCAACA TATGGGGGTT CTTGTTTTTA 2100 

TAATGTTATT ATAATAAATT CAATTATTAG TTAACGACAA ATTGTGGTTT CTCACCTTGA 2160 

ACGGCACTAA TTGCAGCATT AGCAACAATT TTAGACATCA TGTCACGTGC TTCAAATGTA 2220 

GCATTACCAA TATGCGGTX5T TAATACTACA TTATTAAGTG ATTTTAAGTC ATCGGTAATA 2280 

TCTGGTTCAA ATTCATATAC ATCAAGTGCA GCACCTTCAA TTTCATTATC TTTCAATGCT 2340 

TGCACTAGTG CTTGTTCGTG CACGATTGGA CCACGAGAGG CATTGATTAA ATACX5CCGTA 2400 

GATTTCATCA TTTTAAATTG TTCTGTATCA ATTAAATGAT GCATTTTAGG ATTATAAGCA 2460 

GCGTTGATAG TGATAAAATC TGCATTCTTT AATAGTGTAT CTAAATCTAC ATATTTTGCA 2520 

CCGATTTCTC 6TTCTTTTTC TTCTTTGCGA TTAGGTCCAG TGTATAGCAC ATCCATGTCA 2580 

AATGCTCTTG CACGACGAGC TACTGCACTA CCAATTTCAC CTAAACCGAT AATGCCGATT 2640 

GTTTTCCCAG ATACTTCTCT ACCTCTGAAA AATAAAGGTG CCCATCCATC AAATCCAGTT 2700 

GTACGTGATA ATTGGTCCCC TTCAACAATA CGACGCGCTA CTGCAAGTAC TAATCCAATT 2760 

GTTAAATCAG CAGTCGCGTT TGTTGATGCT TTAGGTGTGT TTGTAACATC TATACTTTTT 2820 

TCTCGGGCAT ACTCGATATC AATATTATTA AAACCAGCGC CATAGTTGGC AATGATTTTT 2880 

AAGTCTTTAC CAGCATCGAT AACATCTTTA TCAACGTTTG TAGATAATAA ACTAATTAAG 2940 

GCAGTCGCGT TTTTAACACC TTTAATTAAA GTGTCTTTAT CGACTAATCC TTTACCTTCA 3000 

TACATTTCAA CTTCAAAATG TTCTTGTAAA AGTTTTAAAC CTACTTCTGG TATtGCACCA 3060 

gCAACATAAm CTTTTtCCAT AAAAGAtCAC TCCTTTTATC TTAGTATAGT AGAAGATTAG 3120 

ACAGTATACA ACTATGTCAT GATGTCTTGT GTATCAATGA TGTAAGCGCG TACTTTTGAT 3180 

GGAGGCGATA TAACTTAGGC ACTGTAGAAC TATGAATATT GTAATGTGGA AAAACTGGAT 3240 

CAATTAAATT AGATAACGTA GTTTTAAAGT TAATAGTATT AGAAAAAATT AATATTTTGA 3300 

ATATGGGAGG AAATATAAAT AAGTAGGTGG CAACGAAAAA J ^ V W 

CTATAAAGGA AAGCTCAAAG TTTTTTGATG ACATATGTAC TAGAATTAAG TTTCAAGACA 3420 

ATATGTATCA TCGTGTTTAT ATTAAATATG GATGTAGTTG TAGTTACCTG CTTCACTTGC 3480 

AGAAATAGTT CTAGAACTTA CTGAGAAAGG TCCGCCACTA TAATTCATTT CTGAAATTGT 3540 

AACTGAACCA TCACTGTTTA CACTTTCTAC ATATGCAACG TGACCAAATG GTCCTTCAGA 3600 
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AGCAGCAGCC CAATTATTAG CATTTCCCCA AGTAGAACCG ATTTCTCCGC CAACTTTATC 3720 

ATATACATAC CAAGTACATT GTCCTGCAGT GTATAAGTTA CCAGAATGTG AAATTGATGA 3780 

TGTAGTTGTC GTAGTTGTCG TAGTCGTTGT AGTTTGAGTC GTGTTGTAGT TATAGTTGTT 3840 

GTAATTTGTA TAATTTTCAG CAGCATCTGC ATGATGTGCT TGACCTACTA ATGCTGTGCC 3900 

GATTCCTGCT GTTAACGTAG TTGCTGTTAC TAATTTTTTC ATGAATAAAG TCCTCCAAAG 3960 

TTCTATATCT TTTTTTATAA ATAAAACGTA GCGACTGTTT TATTCTCACA TCTCGAATTG 4020 

ATGACAATAG TTACTTTAAC AAAATtAATG CTTCTTGTGG GGAATGTTAT TGATTTGTAA 4080 

AAGAATAAAA AAACTTTGAC TAATTTTGTA ATAAAAATTA GTCAAAGTTA CAATGAGATT 4140 

AACAGATAAT TAATAGGAAA TATTTATTTG TAATATGTTT AAATAAATCG AATTGTTAAA 4200 

GGTATTATAT ATTCTTGGCC ATTATAATAT TTGACACACG CAATAATTGT GAATACAAAA 4260 

GATAATATTG AGAAAGCGAA TATGGATAAA ATACCGATAA ACXTTAATGAT GAAACCTATA 4320 

ATAATAATGA AATCAATATC TGTAGCAATT AGGAAAACGC CTATTAAAGT 6ATAACGACT 4380 

AAAACGATAG ACCAAATAAT ATAAGAAATC GTATAGTTAA GATAATTTTT TCCAGCACGA 4440 

TCAACTAGTT TCGATTCATC TTTTTTCAAT AACCATATTA TCAGTGGACC AATAATAGAT 4500 

GTGAATAAAC TTAATAAATA GATAAGCATC GCCATAATGT TCTCATCATT GGATTTGCGA 4560 

TTCGGTTGAT GATTTGTTAC GTCGTTCATT TCAGTTGTCA TATTAGACAC TCCTTTGAAA 4620 

ATTGTAATAT TATCTTTAAC TATAACAAAA TATAATCAAA AATAAACATG TTTATTAAAC 4680 

AATTATTAAA AATAAAAATA ATTGGTGGAC GTCX5GCGTTT AAATAGGTTA ATTTAAGGTT 4740 

ATATATACTT AACATTTATA ATGAT6CGTA ATGAATTCGC ATCATTTTTA TATTGTCTTA 4800 

CGTATAATTT GTTTTTAATT TTAACCAAAG ATAGAAAGAG G6TTGTTTAT GAAAATAGCA 4860 

ATTGTAGGAT CAGGAAATGG CGCAGTTACG GCAGCAGTAG ATATGGTGAG CAAAGGCCAC 4920 

GATGTTAAAT TATATTGTCG TAATCAATCT ATAAGTAAGT TTCAAAACGC AATCGAAAAG 4980 

GGOGGATTTG ATTTTAATAA TGAAGGTGAT GAACGTTTCG TAAAATTCAC TGATATTAGT 5040 

GATGATATGG AATATGTTTT AAAAGATGCT GAAATTGTTC AAGTGATTAT TCCATCTTCA 5100 

TACATAGAGT ATTATGCTGA TGTAATGGCA GAGCATGTAA CTGATAATCA GTTGATATTC 5160 

TTCAACATGG CTGCAGCAAT GGGGTCAATT CGTTTTATGA ATGTTTTAGA AGATAGACAT 5220 

ATTGAAACAA AACCACAACT AGCGGAAgcT AATACGTTGA CGTATGGTAC GCGTGTCGAT 5280 

TTTGAAAATG CAGCAGTTGA TTTATCTCTA AATGTACGTC GTATCTTCTT TTCAACATAT 5340 

GATAGAAGCT GTCTAAAT6A TTGTTATGAC AAAGTTTCAA GTATTTATGA TCATTTAGTA 5400 
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CCAACATTAT TGAATGTCGG TCGCATTGAT TATGCTGGCG AGTTCGCTTT ATATAAAGAA 552 0 

GGAATTACTA AACATACAGT TAGATTACTT CATGCAATCG AATTAGAACG TTTGAATTTA 5580 

GGCCGTAGAT TAGGTTTTGA ATTATCAACA GCTAAAGAAT CACGTATTGA ACGTGGTTAT 5640

TTAGAACGTG ATAAAGAAGA TGAACCATTA AATCGTTTGT TTAATACAAG CCCAGTATTT 5700 

TCACAAATTC CAGGACCAAA TCATGTAGAA AGCAGATATT TAACTGAAGA TATTGCATAT 5760 

GGTTTAGTAC TATGGTCAAG CTTAGGTCGT GTTATTGATG TACCGACACC AAATATAGAT 5820 

GCAGTAATTG TAATTGCATC AACCATTTTA GAGAGAGACT TCTTTGAGGA AGGCTTAACA 5880

GTTGAAGAAA TTGGTTTAGA TAAGCTTGAT TTAGAAAAAT ATTTAAAATA AATGATGGCT 5940 

TGAAGATAGA AAAGGATATA GCATTATGCA AAAGCAATAA ATTGAAGAAA AGAGGTTTCT 6000 

CATCAATAAO CGnAGGGGAC GATAGATGAT GAAAAGAAAA CCCACCTTTT TAGAATCAAT 6060 

TTCGACAATG ATTGTAATGG TTATTGTTGT TGTAACAGGC TTTGTGTTTT TTCATATTCC 6120

AATTCAAGTA TTATTAATTA TTGCCTCAGC ATATGCCACA TGGATTGCAA AACGTGTAGG 6180 

CTTAACATGG CAAGATTTAG AAAAAGGCAT TGCAGAACGT TTAAATACTC CAATGCCTGC 6240 

AATTTTAATT ATACTAGOGG TAGGAATTAT AGTAGGCAGT TGGATGTTTT CTGGCACAGT 6300

GCCA6CCTTG ATTTATTATG GCTTAGATTT ATTGAATCCA AGCTATTTTT TAATATCAGC 6360 

CTTTTTTATA AGTGCTGTTA CATCTGTAGC AACTGGTACA GCATGGGGCT CTGCATCAAC 6420 

TGCAGGGATT GCACTTATTT CTATTGGTAA TCAATTGGGG ATTCCTCCAG GGATGGCAGC 6480 

GGGTGCTATT ATAGCAGGGG CTGTGTTTGG CGATAAAATG TCACCATTAT CAGATACAAC 6540 

TAATTTAGCG GCGCTTGTTA CTAAAGTTAA TATATTTAAA CATATACATT CGATGATCTG 6600 

GACGACGATA CCTGCATCAA TCATAGGTTT ATTAGTATGG TTTATTGCTG GATTTCAATT 6660 

TAAAGGGCAT TCAAATGATA AACAGATTCA AACTTTGTTA TCAGAGCTTG CACAGATTTA 6720 

TCAAATTAAC ATATGGGTCT GGGTTCCCTT AATTGTGATC ATTGTTTGTT TGCTATTTAA 6780

AATGGCTACA GTGCCAGCTA TGCTAATATC AAGCTTTTCT GCCATTATAG TGGGGACTTT 6840 

TAATCATCAT TTCAAAATGA CAGATGGTTT CAAAGCAACA TTTAGTGGTT TTAACGAATC 6900 

AATGATACAT CAGTCTCATA TTTCATCCAG TGTGAAAAGC TTGTTAGAAC AG6GTGGTAT 6960

GATGAGTATG ACCCAAATAT TAGTAACGAT ATTTTGCGGA TATGCATTTG CAGGTATTGT 7020 

AGAAAAAGCA GGATGTTTAG AAGTCTTATT AACTACTATT TCTAAAGGCA TCCATTCTGT 7080 

AGGAAGTTTA ATATGTATTA CTGTTATTTG TTGTATTGCG CTTGTATTCG CTGCAGGTGT 7140 

TGCTTCGATT GTAATTATTA TGGTCGGTGT GTTAATGAAA GATTTGTTCG AAAAATACCA 7200 
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AATACCATGG GGAACATCAG GTATTTACTA TACGAATCAA CTTCATGTCT CTGTTQAAGA 7320 

ATTTTTCATA TGGACAGTAC CATGTTATTT ATGCGCAATT ATAGCAATTA TCTATGGTTT 7380 

TACAGGGATA GGTATTAAAA AGTCATCGAA TTCACGTTTA ACTTAATGTG AGCGTGGAAT 7440 

ATATATAATA TGTTGAAACA CTTTAATCAT TTATAATTGT AGCGGTTATA ATTTGAAAAG 7500 

GTTTTAACTT AGAATAAATA TCCTCTATGC ATATACTGAA TATGTTTTGT AGCGGAACAT 7560 

GTTGATATAT GTAATGTAAG TTTTATGTCA TGATTTGTAA TGACTAAATT AATTGAGAAT 7620 

TTGAAGGCAA GTATATTTGT AAGTACTTTA ACTAAAAATT TATCAATGTA TAGCCGATTT 7680 

GACATGCCTA AATTTGGGTG TGTCAATGGC TGTATGTTGT TTATTCTTTA TTACAGAGTG 7740 

AATCGGATTG GTGAAAATCG AAATTTTGAG ATTTTTACCA ATTCQATTTT TTTCATAGAA 7800 

ATTAAAAAAG CCAACAAGGC TCTTGAAACC TTGTTGGCGT AAACATAGCC ATCACTAATT 7860 

AGTGAATGAA GTTATAACCA GCAGCTTGGC TAGCTGAGAT TGTACGTGAA GTTACAACAC 7920

CTGGGCCATA ACCATAGTTC ATTTCTGAAA CTCTTACTGA ACCATTGCTG TTAACACTTT 7980 

CAACGTATGC AACGTGACCG TATGCACCTT GAGTTGTTTG CATAATTCCA CCAGCTTTTG 8040

GTGTATTGTT CACTGTGTAA CCAGCTCTTG CAGCTGCGTT AGCCCAGTTA CTTGCATTGC 8100

CCCAAGTTGA ACCGATTTTA CCACCTACAC GATCAAATAC GTAGTATGTA CATTGACCAG 8160 

AAGTGTATAA GTTACGTCCT GAAGTATAAC CACTTGAGAT TGAACGGCCA TTTGATGATG 8220 

GAGCCATAGT TGTAGTTACT TGAACATTGT TGCTTGAAGT GCTGTAGCTT GCACCTAAAC 8280 

CACCAGTACG GTAGCTGTTT GTGTTGTAAC TATTATAGTT ATTGTAGTTA TATCATTGAT 8340 

TATTATTTGA GTAGTTGTTG TAACGGCTGT AGTTATTGTA GCTATAACCG TTGTTGTAAT 8400 

TGTTATAGTT ATTGTAACCA ITGTAGTAGT AATAGCTGTA GTAGCCATTA TCTTGGTTTA 8460 

ATTGACTTGG ATGCXIAGTTA CCTTTCCATG TGTAATGGTA GTTACCTTGT GCATCAATAG 8520 

TGTAAGTATA GCTATATGAT GTTGGGTCGT TTGGATTATA ACCGTAGTTA TCTTGCTCAG 8580 

AAGCATGAGC TTGATTTCCT GATGCAATTG CGATTGTAGC GAATCCTGCA GTTGCGATAG 8640 

TAGCTGTAGC GATTTTCTTC ATTTTAAAAA TATCCTCCTA AAAATTTTAA ATCTAAAATA 8700 

TTTTCGTAAT GTCCGTGTGA CAAAATTAAT GTTATAAGTT ATCTCTCGTA ATTAAACGAC 8760

AAGAAAGACT ATAACAGAAA TTAGCGTCCT TGTGTGCTTT GTTAACGTTT TGTAATTTTT 8820 

TGCTAATATC TTGACACAAT AGAATTTTAA AAGTATAGAA ATTTGCATTT TGCAAAACTT 8880 

ATAACTACGG CATTCTTTGT GAAAACTGAA TGTTTCGAAA ATAAGTCTGT TACAAATTTG 8940 

TAATATTACT GAAAATTCTA AATGTATATT TTGTGCATAA TATAGGACTT TTAATCAGAA 9000 

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GGATGAAAAT GTATATTTAA TGGATAAAAT ATCCTAATTT AGCATAAAAA AATGTTTTAA 9120 
TAAAAGTATT ATTTGATATA ATCGATTTAT GTTTTGTTAC TGCTAAAAAA CATGTGGCG 9179 
(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1868 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

CCTTCAGCCA TTTGACTTCG ACATGAGTT6 CCTGTACATA TAAAATAAAT TGTTTTTTTA 60 

GTCATAACAA TCTCCTAATT AATTAAAATA TGATAAGTGT TAGATACAAC CCTATGAGGG 120 

TTATAAATAG TACTGGAATT GTAATGATGA TACCAGTTTT AAAGTATGTG CCCCAAGAAA 180 

TCTTAACATC TTTTTGtGTT AAGACGTGTA ACCACAGTAA TGTAGCTAAA GAGCCTATCG 240 

GTGTAATTTT TGGACCTAAA TCAGAACCGA TAACATTCGC ATAAATTAGG CCTTCTTTTA 300 

ACATGCCATG GACATTTGAT TGACCAATAG CAATCGCATC TATTAAAACT 6TAGGCATAT 360

TATTCATTAT TGATGATAAA AACGCTGAAA TGAAGCCCAT TCCCAAAATA GTGCTAAATA 420 

GACCGTAATT GGAAATATAT TCTAATATTT TAGCCAATAT TAAAGTAATG CCAGCATTTC 480

TTAAGCCGAA TACGACGATA TACATACCAA TTGAAAATAA TACTATATTC CAAGGTGC6C 540

CCTTAATGAC TTGCTTAATA TTTACAGCAT TTGATTTACG AGCCAACATT AGAAAAATAA 600 

AAGCAATGAT TCCAGTGAAA ATTGATACCG GAATTTTAGT AAATTTACTG ATTAGATAGC 660 

CGAAAAGTAA TATAACTAGA ACAATCCaTG AAATTTTAAA TAGCTTTAAA TCATTAATGG 720 

CATCCTTAGG ATGCTTTATA TTATTATCAT CAAACGTTTT AGGTATCGCT TTTCTAAAAT 780 

ATAACCACAA TACTATAATA CTTGCTAAAA GCGAGAATAA ATTAGGTATA ATCATTCTAC 840 

TAAAATATCG AACGAATCCT ACATGAAAAT AATCAGCAGA TATAATATTC ACTAGATTGC 900 

TCACGATTAA AG6TAAAGAA GTTGTGTCAG CTATAAAACC ACTCGCAATA ATnAAAGGGA 960 

ATATGGCCCG CTTACTAAAA CCTATATTTT TAACCATCGC TAATACAATA GGCGTTAAGA 1020

TTAAcGTGCG CCATCATTTG CGAAAAATGC AGCAACAATG GCACCCAATA ATATGATATA 1080 

AAOGAACATT TTTAAACCAT TGCCTTTTGA AGCATGAAGC ATGTGAATAG CTGACCATTC 1140 

GAATAATCCA ACTTTATCTA ATATTAATGA AATAAGAATG ACTGAGACAA AAGTCAAAGT 1200 

AGCATTCCAA ACAATACCTG TTACTTCGAA AACATCGGAA AAACTTACAA CACCAGTAAT 1260 

TAATACAAAT AATAAAGTTA CTAGAAAAAT GAGTGTCGCT AAAGTTGTCA TCATTAGCAT 1380 

TCACCAGTCT TAAGGTTATG ACAAATACAT CGTTGGTTAG AGGTATGAAC CTTAGACAAG 1440 

TTATTAATTA CGGACTCAAA AATATTATGA TTgAGCTGGT ATAAATGTTT ATTTCCGATT 1500

TTTCGTGTCG TAACTAAGTT GGTTTTTACT AATGCTTTCA TATGrTAGCT AAGTGTAGGT 1560 

TGAGAGAATT GAAAATGTGC TAACAAATCA CAAGCGCATA ACTCTCCACA AGAAAGTAAA 1620 

TCTAGTATTT CTAATCTGCT TGAATCTGAT AAAACTTTTA AAAATGTTGC TAGTTCTTTA 1680 

TACGTCATAA CATACCTCCT AGACXHTAAA TAGATTATCA TCTATATAGA TGAATGTCTA 1740 

TGTTCCTTTG 6TATATTACA CGATATGACT ATGTAATTTA AATTTGGTTT TAGTATTAAA 1800 

AGGGTATTAA AGATAAATTA TAGATATTGA TTTTGCAAAA TATACTCTTT GTTCTGCATT 1860

GAAAAAGG 1868
(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15249 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

ATTTATGAAA TCCATAGCnA TAAACATTAT TCTTGCATCG GCTATACAAA CAGTTACCGC 60 

AAGCAAATTT GTATATCAAC CTGGAATTGT GTTCACGTCA ATGGCaAATG CCGATGAT6T 120 

GTTATCAGGC GATAGTTATT TTATGGCTGA ATTAAAATCT ATTAAGCGTA TTGTTGAAAT 180 

TCCAGATAAT CAAAAAATAT ACTGCTTTAT AGATGAAATT TTTAAAGGTA CCAACACAAC 240 

TGAACGAATT GCCGCTTCAG AATCAGTACT ATCATTTTTA CATGAAAAAT CTAACTTTAG 300 

AGTTATTGCA GCAACACATG ATATTGAGTT AGCTGAACTC TTAAAACAAC GTTATGAAAA 360 

TTACCATTTC AATGAGGTAA TAGAAAATAA TAACATACAT TTTGATTACA AAATTAAGCC 420 

TGGCAAAGCA AATACACGTA ATGCCATCGA ATTATTAAAA ATCACTTCAT TTCCAGCAAA 480 

AATATATGAA CGAGCAAAAG ATAATGTCCC GAAAATTTAG CATTTAACTT TAAACATAAA 540

AACGTCAGCT ATCACATGAC AGAAGACTAT GAACAGTTTC AATAATGTTC ATAGTAATCA 600 

TGTTAATAAC TGACGTTTAT TTTATTCTGC AGAATACTCT TCTAAATCTA TATTGCTGTG 660 

CCCATTTAAT GCTAAATCAG CAAATCGACC TTGCTGATAC AAATAGTGGC CGGCAACGCC 720 

TATCATTGCA GCATTATCTG TGCATAATTT AGGACTTGGG ATAGTTAATT GAATGTCATT 780 

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AACAATTAAT CGCTGAACAC CATATTCTTT ACAAGCTTGA ATAGCTTTAA ACGTGAGCAC 900 

CTCTACAACA CTGTTTTGAA AGCTCGTTGC TACGTTAGCT TCAATGATTG GaATATTTTT 960 

TTGTCGTTGA TTGTGAAGTT GATTGATTAC GGCACTTTTC AACCCACTAA AACTAAAATC 1020 

ATAACTATCT TTATCCAACC AAACACGAGG GAATGAATAA GTATCTTCAC CTTCAGCAGC 1080 

CAACCGATCA ACTTGTGGAC CACCTGGATA ATTTAAACCA ATTGTTCGTG CCACTTTATC 1140 

ATAAGCCTCA CCTACTGCGT CATCTCX5TGT TTCACCAATG ACTTCAAATG ATAAATGATC 1200 

CTTCATATAA ACTAATTCAG TATGTCCACC TGAAACAATA AGTGCAATTA GCGGGAATGT 1260 

TAATGGCTCT TCTATGTGAT TAGCATATAT ATGTCCTOCA ATATGATGAA CAGGAATAAG 1320 

TGGCTnATCG TAAGCAAATG CCAATGCTTT GGCTGCATTA ACACCTATTA GTAACGCACC 1380 

AATTAGTCCA GGGCCTTCTG TAACCGCTAT GGCATCAATA TCTTCTATTG ATACATCGGC 1440 

ATCCCCTAGA GCCTCGTTTA TTGTTGCTGT TATACCTTCA ACGTGATGTC TACTTGCCAC 1500

TTCGGGAACG ACACCGCCAA ATCGTTTATG ACTTTCAATC TGACTTAAAA CTGTATTTGA 1560 

TAAAATATCT CTGCCATTTT TTATAACACT AACGCTTGTT TCATCACAAC TTGTTTcAAC 1620 

AGCTAGTATT AATATATCTT TAGTCATTTA AATTCACCCA CATAACCATT GCGTCCTCAC 1680 

CTTCACCATA ATAATTTTTA CGTTTACCAC CATATTGAAA TCCTAAATTT TCATATACAT 1740 

GTTGTGCCAC TTTATTATTA ACTCTTACTT CTAAACTCAT CACATCACAA GTGTGACTTG 1800 

CATAGTTTAT TCCGTATTTT AAAAGCATTT GACCTAAACC ATAGCCTCTA TAATTATCAT 1860 

COATTGCAAC TGTTGTAATT TGAGCTTGAT CGATAACAAT CCATAAACCT AAATAACCAA 1920 

TAATTTGTTG TTCAAATTCt AAGACAAAAT ATTTCGCAAA GTTATTTTGC TCTATTTCAT 1980 

GATAAAATGC GTCAATTGTC CAAGAACTGT CATTGAAACT CCGACGCTCA AGATCAAAQA 2040 

CTTqTGGCAC ATCTTCTTTA GTCATCTCTC TAATGTTTAA TTGTTCTTTT GACTGTTGAT 2100 

CCAATTTCGT TCCGCCTCAG CTAATTTATG GTATTTAGGA GTAAATGTAT GTACGTCTGA 2160

AGGTTTATCT AGCAATTGAT ACATGACTGA TGCATTTGGT AGctGCGCAA TCACTTCACC 2220 

TTGTAATTCA TCTTGTAATT TTACAGTATC TTTCCCAATA TAAATAAATG GTTGGTTTAA 2280 

ATCTTCTAAA AAAGCTCGCA ATGCCTCTAT CGACATATAT TGATCTTCTA AAATAGTCAC 2340

TAATTGACCA TTTTGCCACT GGAATATGCC TGTATAAACT GCTTGTCX3TC TTGCATCAAA 2400 

CACAGGAACC AATAATTTAT CAGTATGATC GATTGTTGCT GCCAATGCCT TTAATGATGA 2460 

AACACCATAT AATTTAACAT CTAACGCATA CGCTAATGTT TTAGCAACAG TAACACCGAT 2520 

ACGTAAGCCA GTATATGAAC CAGGACCTTC AGCAACAATA ATCGCATCTA ATTGCTGTTT 2580 
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TTGTTTAGAA TCCGTAGTTA TTTCAGCTAA AACTTCATCG TTTTGCATCA ATGCTACTGA 2700 
TAATGGTTGA TTCGATGTAT CAATGAGCAG CGAATTCATG GATAATTGCC TCCTTAATTT 2760 

GTTCATAATG TTCTCCTTGC GCGAACAACT CAATTTGTCT TGTATTTTCA GATATTGTTG 2820

AAATGTTAAT AGATAAATGC GTCGCTGGAA GTAAATCTTT TATAAATTGA CTCCATTCAA 2880 

TAACAGTAAT TGCCTGATCT TCGAAAAATT CATCAAATCC TAAATCTTCA TCAGAATCTT 2940 

CTAAGCGATA ACAATCCATA TGATGCAATT TTAAATTTTT ACCCCTATAT GATTTAATGA 3000 

TGTTAAATGT CGGGGAATTA ATCGTACGTC TTACACCAAG AGCTTTTCCT ATAAATTGCG 3060 

TTAACGTTGT TTTACCTGCT CCTAAATCTC CGTTAAGTAA AATCAAATCA OyVCTTTTCA 3120 

ATTGCTCAAC TAAAAATATA GCAAATTGAT TCATTTCATC TAAATTATTT ATCTTTATCA 3180 

ATGTTGATTC TCCTATATTA TGCTTTTCAT TCATAAAAAT GATTATCCAT TGTTCAATCG 3240 

TATCTAACTT TATATTTAAC CTTTATATTG TAACAAATTT CAACTTAAAT TTCTTATCTT 3300

TGAAACAGAT TATCTATTCA AAGTTAATTG TAAGAAAATT TAAAATATTT GTTGACATAC 3360 

TAAAGCAGAT ATAGTAAATT AAATTTATCA AATTTTTAGA CAATTCTAAC TATTAAAGTG 3420 

ATATATACCA TTCACGGAAG GAGTATAATA AAATGCTTAA TCAATATACT GAACATCAAC 3480 

CGACAACTTC AAATATTATT ATTTTATTAT ACTCTTTAGG ACTCGAACGT TAgTAAATAT 3540

TTACTAAACG CTTTAAGTCC TATTTCTGTT TGAATGGGAC TTGTAAACGT CCCAATAATA 3600 

TTGGGACGTT TTTTTATGTT TTATCTTTCA ATTACTTATT TTTATTACTA TAAAACATGA 3660 

TTAATCATTA AAATTTACGG GGGAATTTAC TATGCGAaCG AgcATGATCA AAAAAGGAGA 3720 

TCACCAAGCA CCAGCAAGAA GTCTTTTACA TGCCACGGGC GCGCTAAAAA GTCCAACTGA 3780 

TATQAACAAA CCATTTGTAG CTATTTGTAA CTCTTATATT GATATTGTTC CTGGACATGT 3840 

TCAqTTGAGA GAGCTTGCAG ATATAGCTAA AGAAGCAATT AGAGAAGCCG GTGCCATTCC 3900 

ATTTGAATTC AATACAATTG GTGTTGATGA TGGAATAGCT ATGGGACATA TCGGAATGCG 3960

ATATTCTCTA CCATCACGTG AAATTATTGC AGATGCAGCT GAAACTGTAA TTAACGCTCA 4020 

TTGGTTTGAC GGCGTATTTT ACATTCCTAA TTGTGACAAG ATTACACCCG GTATGATTTT 4080 

AGCAGCCATG AGGACAAACG TACCAGCTAT CTTTTGCTCT GGTGGACCAA TGAAAGCTGG 4140

CTTATCTGCA CATGGAAAAG CATTAACACT TTCATCAATG TTTGAAGCAG TCGGCGCATT 4200 

TAAAGAAGGA TCGATTTCTA AAGAAGAATT TTTAGATATG GAACAAAATG CCTGCCCTAC 4260 

TTGTGGTTCA TGTGCTGGGA TGTTTACTGC AAATTCAATG AACTGTTTGA TGGAAGTTTT 4320 

AGGTCTAGCA TTACCATACA ACGGTACTGC ACTT6CAGTC AGTGATCAGC GACGAGAAAT 4380 
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TATCGTTACT CGCGAAgCAA TTGATGATGC 
ATTTGCACTT GATATGGCTA TGGGTGGTTC 4500

AACAAACACG GTACTGCATA CGTTAGCCAT TGCCAATGAA GCTGGTATTG ATTATGACTT 4560

AGAGCGCATT AATGCTATTG CCAAACX3CAC GCCATATTTA TCAAAAATAG CACCTAGTTC 4620

ATCGTATTCA ATGCATGATG TGCATGAAGC TGGTGGCGTC CCAGCAATTA TTAATGAATT 4680

GATGAAGAAA GATGGCACGT TACACCCAGA TAGAATCACA GTTACTGGCA AAACGTTACG 4740

TGAAAATAAC GAAGGCAAAG AAATTAAGAA CTTTGATGTC ATTCACCCTC TTGATGCACC 4800

ATATGATGCA CAAGGCGGTT TATCTATCTT ATTTGGTAAT ATCGCCCCTA AAGGCX3CAGT 4860

TATTAAAGTT GGCX3GCGTTG ATCCATCTAT CAAAACATTT ACTGGGAAAG CAATTTGTTT 4920

CAATTCX3CAT GATGAAGCTG TTGAAGCAAT AGACAATCGT ACCGTTCGTG CAGGCCACGT 4980

CX3TTGTCATT AGATATGAAG GACCTAAAGG TGGACCAGGT ATGCCTGAAA TGTTAGCACC 5040

TACTTCCTCT ATTGTTGGTC GCGGCTTAGG TAAAGATGTT GCATTAATTA CTGATGGGCG 5100

TTTTTCCGGT GCCACAAGAG GTATTGCAGT TGGTCATATT TCCCCTGAAG CTGCATCTGG 5160

TGGACCAATT GCCTTAATTG AAGATGGTGA TGAGATTACT ATTGATTTAA CAAATCGTAC 5220

ATTAAACGTA AACCAGCCTG AAGATGTTCT AGCGCGTCGC CGAGAATCTT TAACACCATT 5280

TAAAGCGAAA GTAAAAACAG GTTATCTAGC TCGTTATACT GCCCTAGTAA CTAGCGCAAA 5340

TACAGGTGGG GTCATGCAAG TCCCTGAGAA TTTAATTTAA TTTATTTTTA TATTGGAGAT 5400

GGTTAAAATG TCTAAAACTC AACATGAAGT AAACCAAAAT ATTGACCCTT TAAAAATGGC 5460

TGAATCACTT GAACCTGAAC AACTAAATGA AAAAACTTTA AATGATATGC GTTCAGGATC 5520

AGAAGTGCTA GTAGAAGCTC TACTTAAAGA AAATGTGGAT TATTTATTCG GTTATCCTGG 5580

TGGTGCOGTA CTACCTTTAT ATGACACGTT TTATGATGGT AAAATCAAAC ATATTTTAGC 5640

AAGA^CGAA CAAGGTGCTG TTCATGCTGC AGAAGGTTAT GCACGTGTAT CTGGTAAamT 5700

GGCGTCGTTG TAGTTACAAG CGGTCCaGGT GCAACTAATG TAATGACAGG TATTACGGAT 5760

GCACATTGCG ACTCTTTACC TCTAGTTGTA TTCACTGGAC AAGTTGCTAC ACCAGGCATT 5820

GGTAAAGATG CATTCCAAGA AGCGGATATT CTATCTATGA CTTCACCAAT TACAAAACAA 5880

AATTATCAAG TGAAACGTGT TGAAGATATC CCTAAAATCG TACACGAAGC TTTCCATGTA 5940

GCTAATTCTG GACGCAAAGG TCCTGTAGTG ATTGATTTTC CAAAAGATAT GGGTGTTTTA 6000

GCTACAAATG TGGATTTATG CGACGAAATC AATATTCCAG GTTATGAAGT TGTTACAGAA 6060

CCAGAAAATA AAGACATTGA CACTTTCATC TCACTTTTAA AAGAAGCGAA AAAGCCTGTC 6120

GTATTAGCCG GCGCAGGTAT TAATCAATCA AAATCAAATC AATTATTAAC ACAGTTTGTT 6180
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GATACACTAT TTTTAGGTAT GGGAGGAATG CATGGTTCTT ATGCTAGTAA CATGGCATTA 6300 

ACTGAGTGTG ATTTACTCAT TAATTTAGGT AGCCGCTTCG ATGATAGATT AGCAAGCAAA 6360

CCTGATGCCT TTGCACCTAA CGCCAAAATT GTACATGTAG ATATTGATCC TTCAGAAATC 6420 

AATAAAGTTA TTCATGTAGA TTTAGGTATT ATTGCAGACT GTAAAAGATT TTTAGAATGT 6480 

TTAAATGATA AAAATGTTGA GACTATAGAA CACAGTGACT GGGTTAAACA TTGTCAAAAT 6540 

AATAAGCAGA AACACCCATT TAAACTTGGT GAAGAAGATC AAGTATTTTG TAAGCCACAA 6600 

CAAACAATCG AATATATCGG CAAAATTACA AATGGTGAAG CAATTGTTAC TACAGACGTG 6660 

GGACAACATC AAATGTGGGC AGCTCAAm TATCCATTTA AAAATCACGG ACAATGGGTT 6720 

ACAAGCXMTG 6TTTAGGAAC AATGGGATTC GGTATTCCTT CGTCAATTGG T6CCAAATTA 6780 

GCTAATCCTG ATAAAACAGT CGTATGTTTC GTCGGTGACG GTGGTTTCCA AATGACAAAC 6840 

CAAGAAATGG CACTTTTACC CGAATATGGT TTAGATGTCA AAATCGTACT AATCAATAAT 6900

GGAACATTAG GTATGGTTAA ACAATGGCAA GATAAGTTCT TTAATCAACG CTTCTCACAC 6960 

TCAGTATTTA ATGGTCAACC TGATTTTATG AAAATGGCAG AAGCATATGG CGTCAAAGGT 7020 

TTCTTAATCG ATAAGCCAGA ACAACTGGAA GAACAATTAG ATGCAGCGTT TGCTTATCAA 7080 

GGACCAGCTT TAATTGAGGT TCGTATTTCC CCTACTGAAG CTGTAACCCC AATGGTTCCG 7140

AGTGGCAAAT CAAATCATGA AATGGAGGGC TTATAATGAC AAGAATTCTT AAATTACAAG 7200

TTGCGGATCA AGTCAGCACG CTAAATCGAA TTACAAGTGC TTTTGTTCGC CTACAATATA 7260 

ATATCGATAC ATTACATGTt ACACATTCTG AACAACCTGG GATTTCTAAC ATGGAAATTC 7320 

AAGTCGATAT TCAAGATGAT ACATCACTTC ATATATTAAT TAAAAAATTA AAACAACAAA 7380 

TTAATGTTTT AACGGTTGAA TGCTACGACC TTGTTGATAA CGAAGCTTAA TTTTAAGACA 7440 

AAGGfiAATGA TGCGCTAATT AGTTATAGAT ATATCATAGG CTGCTAGTTA ACATCTGCCA 7500 

CTATTACAAA GTTATATTTC AGAATTTTCG AAACACAAAA TATTTAATTA TTTGGAGGAA 7560 

TTTATTATGA CAACAGTTTA TTATGATCAA GATGTAAAAA CGGACX3CTTT ACAAGGCAAA 7620 

AAAATTGCAG TAGTAGGTTA TGGATCACAA GGTCACGCGC ATGCACAAAA CTTAAAAGAC 7680 

AATGGATATG ATGTAGTCAT CGGCATTCGC CCAGGTCGTT CTTTTGACAA AGCTAAAGAA 7740

GATGGATTTG ATGTGTTCCC TGTTGCAGAA GCAGTTAAGC AAGCTGATGT AATTATGGTG 7800 

CTATTACCTG ATGAAATTCA AGGTGATGTA TACAAAAACG AAATTGAACC AAATTTAGAA 7860 

AAACATAATG CGCTTGCATT TGCTCATGGC TTTAACATTC ATTTTGGTGT TATTCAACCA 7920 

CCAGCTGATG TTGATGTATT TTTAGTAGCT CCTAAAGGAC CGGGTCATTT AGTTAGACGT 7980 
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CAAGCACGTA ATATTGCTTT AAGTTATGCA AAAGGTATTG GTGCAaCTCG TGCAGGTGTT 8100 

ATTGAAACAA CATTTAAAGA AGAAACTGAG ACAGATTTAT TTGGTGAACA AGCAGTACTT 8160 

TGCGGTGGTG TATCGAAATT AATTCAAAGT GGCTTTGAAA CATTAGTAGA AGCGGGTTAT 8220 

CAACCAGAAT TAGCTTATTT TGAAGTATTA CATGAAATGA AATTAATCGT TGATTTGATG 8280 

TATGAAGGCG GTATGGAAAA TGTACGTTAC TCAATTTCAA ATACTGCTGA ATTTGGTGAC 8340 

TATGTTTCAG GACCACGTGT TATCACACCA GATGTTAAAG AAAATATGAA AGCTGTATTA 8400 

ACTGATATCC AAAATGGTAA CTTCAGTAAT CGCTTTATCG AAGACAATAA AAATGGATTC 8460 

AAAGAATTTT ATAAATTACG CGAAGAACAA CATGGTCATC AAATTGAAAA AGTTGGTCGT 8520 

GAATTACGCG AAATGATGCC TTTTATTAAA TCTAAAAGCA TTGAAAAATA AGATAGACCT 8580 

ACAATGAGGA GTTGTTAAAT ATGAGTAGTC ATATTCAAAT TTTTGATACG ACACTAAGAG 8640 

ACGGTGaACA AACACCAGGA GT6AATTTTA CTTTTGATGA ACGCTTGCGT ATTGCATTGC 8700

AATTAQAAAA ATGGGGTGTA GATGTTATTG AAGCTGGATT TCCTGCTTCA AGTACAGGTA 8760 

GCTTTAAATC TGTTCAAGCA ATTGCACAAA CATTAACAAC AACGGCTGTA TGTGGTTTAG 8820 

CTAGATGTAA AAAATCTGAC ATCGATGCTG TATATGAAGC AACAAAAGAT GCAGCGAAgC 8880 

CGGTcGTGCA TGTTTTTATA GCAACATCAC CTATTCATCT TGAACATAAA CTTAAAATGT 8940 

CTCAAGAAGA CGTTTTAGCA TCTATTAAAG AACATGTCAC ATACGCGAAA CAATTATTTG 9000 

ACGTTGTTCA ATTTTCACCT GAAGATGCAA CGCGTACTGA ATTACCATTC TTAGTGAAAT 9060 

GTGTACAAAC TGCCGTTGAC GCTGGAGCTA CAGTTATTAA TATTCCTGAT ACAGTCGGCT 9120 

ACAGTTACCA TGATGAATAT GCACATATTT TCAAAACCTT AACAGAATCT GTAACATCTT 9180 

CAAAT6AAAT TATTTATAGT GCTCATTGCC ATGACGATTT AGGAATGGCT GTTTCAAATA 9240 

GTTT5GCTGC AATTGAAGGC GGTGCGAGAC GAATTGAAGG CACTGTAAAT GGTATTGGTG 9300 

AACGAGCAGG TAATGCAGCA CTTGAAGAAG TCGCGCTTGC ACTATACGTT CGAAATGATC 9360 

ATTATGGTGC TCAAACTGCT CTTAATCTCG AAGAAACTAA AAAAACATCG GATTTAATTT 9420 

CAAGATATGC AGGTATTCX3A GTGCCTAGAA ATAAAGCAAT TGTTGGCCAA AATGCATTTA 9480 

GTCATGAATC AGGTArTCAC CAAGATGGCG TATTAAAACA TCGTGAAACA TATGAAATTA 9540

TGACACCTCA ACTTGTTGGT GTAAGCACGA CTGAACTTCC ATTAGGAAAA TTATCTGGTA 9600 

AACACGCCTT CTCAGAGAAG TTAAAAGCAT TAGGTTATGA CATTGATAAA GAAGCGCAAA 9660 

TAGATTTATT TAAACAATTC AAGGCCATTG CGGACAAAAA GAAATCTGTT TCAGATAGAG 9720 

ATATTCATGC GATTATTCAA GGTTCTGAGC ATGAGCATCA AGCACTTTAT AAATTGGAAA 9780 
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AAGAGGGTCA 
TATTTACCAG GATTCAAGTA TTGGTACTGG TTCAATCGTA GCAATTTACA 9900

ATGCAGTTGA TCGTATTTTC CAGAAAGAAA CAGAATTAAT TGATTATCGT ATTAATTCTG 9960

TCACTGAAGG TACTGATGCC CAAGCAGAAG TACATGTAAA TTTATTGATT GAAGGTAAGA 10020

CTGTCAATGG CTTTGGTATT GATCATGATA TTTTACAAGC CTCTTGTAAA GCATACGTAG 10080

AAGCACATGC TAAATTTGCA GCTGAAAATG TTGAGAAGGT AGGTAATTAA TTATGACTTA 10140

TAACATTGTT GCCCTACCTG GTGATGGAAT CGGTCCAGAA ATTTTGAACG GATCTCTATC 10200

ATTGCTTGAA ATTATAAGTA ATAAATATAA CTTTAATTAT CAAATAGAGC ACCACX»ATT 10260

TGGTGGTGCC TCTATTGATA CATTCGGCXSA GCCTTTAACT GAGAAAACCT TAAATGCGTG 10320

TAAAAGAGCA GATGCTATTT TACTGGGTGC AATCX3GTGGA CCTAAATGGA CAGATCCTAA 10380

CAATCGACCA GAACAAGGAT TATTAAAATT GCGTAAATCC TTAAATTTAT TTGTAAATAT 10440

ACGCCCCACT ACCGTTGTCA AAGGCGCTAG TTCTTTATCA CCTTTAAAGG AAGAACGCGT 10500

TGAAGGCACA GATTTAGTTA TAGTCCGTGA ATTGACAAGT GGTATTTATT TTGGAGAACC 10560

TAGACATTTT AATAATCACG AGGCCTTAGA TTCTCTTACT TATACAAGAG AAGAAATAGA 10620

ACGCATTGTT CACGTAGCAT TTAAATTGGC CGCTTCAAGA CGAGGAAAAC TAACATCAGT 10680

TGATAAAGAA AATGTATTAG CTTCTAGTAA ATTGTGGCGC AAAGTCGTAA ATGAAGTAAG 10740

TCAATTATAT CCAGAAGTAA CAGTAAATCA CTTATTTGTT GATGCTTGTA GTATGCATTT 10800

AATCACAAAT CCAAAACAAT TTGACGTCAT CGTATGTGAA AACTTATTTG GCGATATTTT 10860

AAGTGATGAA GCTTCAGTGA TTCCTGGTTC ACTTGGTTTA TCACCTTCTG CTAGTTTTAG 10920

TAACX5ATGGT CCAAGATTGT ATGAGCCTAT TCATGGATCA GCACCAGATA TTGCAGGTAA 10980

AAACGTTGCC AATCCATTTG GAATGATTCT ATCTTTAGCG ATGTGTTTAC GTGAAAGCTT 11040

AAATCAACCA GATGCTGCAG ATGAATTAGA ACAACATATT TATAGCATGA TTGAACATGG 11100

GCAAACGACA GCAGATTTAG GCG6CAAATT GAATACTACT GATATTTTCG AAATTCTATC 11160

TCAAAAATTG AATCACTAAG GGGGAGATGT AAATGGGTCA AACATTATTT GACAA6GTGT 11220

GGAACAGACA TGTGTTATAC GGGAAATTGG GCGAACCGCA ACTATTATAC ATT6ATTTAC 11280

ACCTTATACA TGAAGTTACT TCTCCTCAAG CATTTGAAGG ACTTAGGCTT CAAAACAGAA 11340

AATTAAGACG CCCAGATTTA ACATTTGCAA CACTCGATCA CAATGTTCCT ACTATTGATA 11400

TATTCAATAT TAAAGATGAA ATTGCAAACA AACAAATCAC AACATTACAA AAAAACGCCA 11460

TAGATTTTGG GGTGCATATT TTTGATATGG GTTCTGATGA ACAAGGTATT GTTCACATGG 11520

TAGGACCTGA GACAGGACTT ACACAGCCTG GCAAGACAAT CGTTTGTGGT GACTCTCACA 11580
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ATGTTTTCGC AACTCAAACG CTATGGCAAA CAAAACCCAA AAACTTAAAA ATCGATATTA 11700 

ATGGTACCTT ACCAACAGGC GTCTATGCTA AGGACATTAT TCTGCATTTA ATTAAAACGT 11760 

ATGGTGTTGA CTTTGGTACA GGCTATGCTT TGGAATTTAC TGGCGAAACA ATTAAAAACC 11820 

TTTCAATGGA TGGTCGAATG ACTATTTGTA ACATGGCTAT CGAAGGTGGT GCCAAATACG 11880

GCATAATCCA ACCTGATGAT ATAACATTTG AATATGTTAA AGGGAGACCA TTTGCCGATA 11940 

ACTtCGCTAA ATCAGTTGAT AAGTGGCGTG AgCTATATTC TGATGACGAC GCGATATTTG 12000 

ATCGTGTAAT TGAACTTGAT GTTTCAACAT TAGAACCACA AGTGACATGG GGAACTAATC 12060 

CTGAAATGGG TGTTAATTTC AGTGAACCAT TCCCTGAAAT CAATGATATC AACGATCAAC 12120 

GTGCGTATGA TTATATGGGG TTAGAACCAG GTCAAAAAGC TGAAGACATC GACTTAGGGT 12180 

ATGTTTTTCT CX3GTTCATGT ACAAATGCTA GACTATCAGA TTTGATTGAA GCTAGTCATA 12240 

TTGTTAAAGG AAATAAAGTT CATCCAAATA TTACAGCTAT TGTCXrTACCA GGTTCTCX3TA 12300 

CAGTAAAAAA AGAAGCAGAA AAATTAGGTC TAGATACTAT CTTTAAAAAT GCAGGATTTG 12360 

AATGGCGTGA ACCAGGATGT TCAATGTGTT TAGGCATGAA TCCTGACCAA GTACCTOAGG 12420 

GCGTACATTG TGCATCTACA AGTAATCGAA ACTTTGAAGG ACGACAAGGC AAAGGTGCAA 12480 

GAACACATTT AGTATCCCcT GCTATGGCAG CAGCAGCAGC TATTCATGGT AAATTTGTGG 12540 

ACGTAAGAAA GGTGGTTGTT TAAATGGCAG CAATCAAACC TATTACAACA TATAAAGGTA 12600 

AAATAGTCCC TCTCTTCAAC GACAATATCG ATACAGACCA AATCATTCCT AAGGTACACT 12660 

TAAAGCGTAT TTCAAAAAGT GGCTTTGGTC CATTTGCTTT TGATGAATGG CGGTACTTAC 12720 

CTGATGGTTC AGATAATCCT GATTTCAATC CTAACAAACC ACAATATAAA GGGGCTTCTA 12780 

TTTTAATTAC TGGAGATAAT TTTGGATGTG GTTCAAGTCG TGAACATGCT GCTTGGGCTC 12840 

TTAfl^GACTA TGGTTTTCyVT ATTATTATTG CAGGAAGTTT CAGTGACATA TTTTATATGA 12900 

ArrGCACTAA AAATGCGATG TTGCCTATCG TTTTAGAAAA AAGTGCCCGT GAACATCTTG 12960 

CACAATATGT TQAAATTGAG GTCGATTTAC CAAATCAAAC TGTGTCATCA CCAGACAAGC 13020 

GTTTCCATTT TGAAATTGAT GAAACTTGGA AGAATAAACT TGTAAATGGC TTAGATGACA 13080 

TTGCAATCAC CCTACAATAT GAATCATTAA TAGAAAAATA TGAAAAATCa CTTTAAGGGA 13140 

GTTGAATATT ATGACAGTCA AAACAACAGT TTCTACGAAA GATATCX3ATQ AGGCATTTTT 13200 

AAGACTTAAA GATATTGTCA AAGAAACACC TTTACAATTA GACCATTACT TATCTCAAAA 13260 

GTATGATTGT AAAGTCTATT TAAAACGAGA AGATTTACAA TGGGTACX3TT CTTTTAAATT 13320 

AAGAGGTGCT TACAACGCTA TTTCTGTTTT ATCAGATGAA GCTAAAAGTA AAGGTATTAC 13380 
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(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14051 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



15249 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 

GTGGCAATAT TTCTAGTTCT CGTTTTGATA AGATTTTAAA AGGATCTGTT GTGTTTGCAG 60 

TGTCCTGATT TGAATTAGAT ACAAATTCAT TCACTAAAGA TGTTGTAAGT TTCATATCTA 120 

CATATGTTTC ACCTTTATAT ACAGTTCGAA TAGCTAACAA TAATTGTTCA TCAGGTGCAT 180 

TTTTCAATAT GTAACCTTTC GCACCATTAC GCAACACATG GAACAAATAC TCCTCATCAT 240

CAAACATTGT TAATATTAGT ATTTTAGTTT CAGGAAAACT GTCAGCAATT TTACTCGTAG 300 

CGATAAGACC TGACTCACCT GGTGGcATAC TTAAATCCAT TAGTAACACA TCAGGTTTAt 360 

ATTCCATTAC TTnTGGTAA GCTTCGACGC CATCTGCAGC CGTTGCAACA ACTTCCATAT 420

CATTTTGATA ATTTAAAATC ATAGAGAACC CCGTACGGAC AACAGCGTGA TCATCGGCAA 480 

TGACTATTTT CAATTTTATT CCCCCAATGT ATGTTTCAAA TTGGAATGTT CAATGTAACA 540 

TTGGTACCCT CACCAATTTT CGTTTCAATA TTGACGCTAC CGCTGACTAA CTCAGCTCGC 600 

TCATTCATTC CATATAAACC GAGTCCAGAA CCTTTAGGCT TAGAACTTGG ATCAAAACCA 660 

TTTCCCGCAT CTATCACTTC TGCTACCAAA TGGCGCCCAG TTTGACGGAT ACCTACATTT 720 

ATTTCATTTA CATCAGCGTA TTTCAACGCA TTTAAAATAG CTTCTTGCAC TACTCGATAA 780

ACAACCGTTT CAATATCACT ATCAAAGCGA GTATTTTTAA TATTTGATGT ATATATGATT 840 

TTTATTCCAT AATTTTCTTC AAACTGTTTA AAATATGATT TAAAAGCTGC TTCAAGGCCT 900 

AGATCATCCA AAGAAGCGGG TCTTAATTCA ACCGACATAT TACGTATATC ATCAATTAAT 960 

TTAGCGACAA TATATTCAAT ATTTTCTGCG TCTTCCAAAA GCTTAGTTGT ATCTTCTTGA 1020 

TATTTTAATA ATCTCAATTG AACATCTACA TTGAGCATTT CTTGAATCAC ACTATCATGT 1080

AACTCTCTAG AAATTCGCTT TCTTTCATTT TCTTGGGCTG AGATTGTTTT ACGCATCATA 1140 

CGTTGTTGAT GCAATTTCTC TTGCTGTTCA ATTTGTGATG AAACATTTTG AAGCGTAAAT 1200

GCATGAATTC CCCTGTCTTG ATCAATCAAC TGATATGTTG CTGTAAATGG CATCACTTTT 1260 

TGATCTTTCG TCTTCATAAA TACTTGGAAA TTCGTAGCTT GTACTTGCAT CGATTCTAAG 1320 
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ATCGCATTCG CCACAGCACT GTAATTATCT 
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TCTTCAGATA ATATATCTTT AGCAGCATCA 1440

TTCATTGCAA TAATTTTACC GTTATCATCA GCAAAAACTA TCTTTTCGAT TGAATGCTCA 1500

TAATATTTTT TCAATAAAGT ATCTAACTGT ATACTGTCCT CATTAATCAT GACTTACACC 1560

CTAATTCATC TCATTATTTA TCATCATTGA AAATACCAAA CTTACGTTGA ATATCATCAT 1620

TATCAAATAT TTTTGGTAAA GGACGACCAT CTCTTTGACC AAATAATAGT ACGCCATACA 1680

CTTGATTCTT ATACCAAAGC GGCACTGCTA AAACTGCTGT TAATGATTCG CTCAATAAAA 1740

TTGGATAGTC AATCTTTTCT TCAGGCCCTA AAGCTAAACC AACATTGGCT ATTACCATAC 1800

GCTTTCCTGT TTTCATAACA GTTCCAGCTA ATCCACGACC TTTTCTTAAA ATAATCAATT 1860

TAAATCGATT ATTTTTATTA CCTGAAACAT AGTGCCATTT TATTGGAGAT GATGGTTTGT 1920

TAQATTCATA GAAAGCGATT GCCGCAAAAT CATAACCCTC TTCTTTGCGT ATTTTATCTA 1980

ATGTCTCTTG AAATCTACGA TCTTCAATTA TTGCTTCTGG TGTCAAATCC TTTCACCTCT 2040

TATGCTTACA CTTTATTCTT ACGGTAAATA ATATATCTGC GATTTATATA TGTCAAAGGT 2100

ACACTCCAAA CATGCACCAA ACGTGTAAAT GGCCAACAAG CCATAATAGT GAAACCTAAC 2160

AATATATGCA TTTTAAATGC AATCGGCACA CCACTCATCA ATGACGCATC TGGTTTTAAC 2220

ATAAATAATT GTCTAAACCA AATTGATAAT GAAGTTCTGT AGTTAAAGTC TGGATGTTGT 2280

ATATTTGTTA CTAATGTTGC GTAACATCCC ATAAATACGA TAAGTAATAA TAAGAAATTT 2340

ACAAATATAT CCGACGCTGA ACTTAATCTT CGAATACTTT TCGTAGTAAC ACXJTCTCGCT 2400

GTTAATAAAA ACATCCCTAT CAAAGTTATT ATACCAAAGA TGCTACCAAT ATAAACAGCG 2460

CCTATATGAT ATAAATGCTC AGACACACCC ACTGCATCCA TCCATGGTTT CGGTATTAAC 2520

AATCCAACTA CGTGTCCAAA AAACACTGGA ATAATACCTA AGTGAAATAA TAAACTTCCC 2580

CACATCAACC TTTTTCTTTC TATTAATTCA CTAGATTTAG CTGTCCAAGA AAATTTATCA 2640

TAACGATAAC GTGCAATATG ACCTGCGACA AAGACAACTA AACATAAATA CGGAAATATA 2700

ACCCATAAAA ACTGATTAAG CATGATGTTT CACTCCTTTT GGTGATGTCA AACATAATTT 2760

CAATGTTTTT CTAAGTGCTT GAATCACATA GGCATATGGA TTGTTATCTT CACCAAGTGC 2820

ATTCGCCATC ACATATGTTC CATCCTCAAT AATCATAATG ATTAATTGAA TATTCTCTTC 2880

AGCTCTTGGA TCATTTCGCC ATTCTGCCAC TTGCAAAAAT TGAAGCATCA ACGGTAGATA 2940

ATCAGAAAGT TCATTATCTA CCATTTCTAG TCCAAACATT TCATATAATA CCTTTAATTT 3000

AGCTAACATT TGCCCAOGTT CTTTTTGCGT ATCAAATTTG TTATACGTCA TATATAATGG 3060

TGCTTTTTTC GTAAAATCAA ATGTATCTGT ATAAATCGCT TTGATTTCTG ATAATGAAAA 3120
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TGTTTCTTCA AAAGTTTTTG GATGAAAAGT 
TAATTTTTCT GGAAAACATA ACTGTTGTGC 3240

CATATATCCA AAACTTTCTT GATATTTTTT AAAATTATCG AAATTAATCA CGGAAAATCC 3300

CTCCATAGAA ATTCTCATTA TAAATTTCTT GACCAGTTTr CCCTGAACCT ACTGCAACGC 3360

CACAGCCTTC ACAGTTATCT CCAAAATGCT CGCCGCCGTA ATTGTATCCT GTACTACCTT 3420

GTGCGTGATA CGTATCTAAA TAGGTTTCTT TGTGTGATGT TGGAATAACA AATCGATCTT 3480

CATATTTGGC TAGTCCTAAT ATW^CGATACA TGTCTTTAGT TTGGCGCTCG GTTATACCTA 3540

ATCGCTCTAA TCGAGACGTG TCAAATGGCT GTTGAGTAAC TTGAGATCTC ATATAACTTC 3600

TCATCATTGC CATACGTTGT AGGGCTCCTT TTACTGGCTC TGTATCTCCT GCAGTGAAAA 3660

TATTAGCTAA GTATTCAATA GGTAAACGCA TTTCTTCAAT GGCTGGGAAA ATCGCATCTG 3720

GATTTTGAGT TGTATTTTTA CCTTCAAAAT AGCTCATAAT TGGGCTAAGT GGTGGGCAAT 3780

ACCAAACCAT CGGCATCGTT CTAAATTCAG GATGTAACGG AAATGCAAGT TTATATTCAA 3840

TTGCTAACTT ATAAATTGGA GAGTTTTGTG CAGCTTCAAT CCAATCGTAA CCAATACCAT 3900

CTTTTTCAGC TTGAGCAATG ACTTCTTCGT CAAATGGGTT TAAGAATATA TCTAATTGTT 3960

TTTCATATAA ATCTTTCTCG TCTACTGCTG AAGCTGCTTC ATGAACTCGA TCTGCATCAT 4020

ATAATAAAAC ACCTAAGTAA CGCATACGTC CTGTACAAGT TTCAGAGCAT ACCGTAGGCA 4080

TACCCGCCTC GATTCTCGGG AAACAGAAAG TACACTTTTC AGCTTTGTTC GTTTTCCAAT 4140

TGAAGTAAAC TTTCTTATAT GGACAACCTG TCATACAGTA ACGCCATCCA CGACATGCGT 4200

CTTGGTCAAC TAATACAATG CCATCTTCAT CACGTTTATA CATAGCACCT GAAGGACACG 4260

ATGCAACGCA ACTTGGATTC AAGCAATGTT CACATAAACG TGGTAAATAC ATCATAAAA6 4320

TTTCJGTCAAA TTGGAATTTA ATATCTTCTT CTATTTnTG GATGTTAGGA TCTTTTGGAC 4380

CTGfAACATG ACCACCTGCT AAGTCATCTT CCCAGTTAGG TCCCCATTCA ATTTCAATGT 4440

TATCCCCCGT AATTTCTGAA TACGCTCTAG CAACTGGCGA ATGCTTCCCT GATTTCGCAG 4500

TTGTTAAATG TTCATAATTA TAGTTCCATG GCTCATAATA ATCTTTAATT AATGGCATAT 4560

CTGGGTTATA AAAAATTTTA CCTAAAGCAA TTTTTGAAAT TCTACTTCCA GATTTTAATT 4620

CAAGTTTCCC TtTACGATTT AGTACCCAAC CACCTTTGTA GTGTTCTTGG TCTTCCCAAC 4680

GTTTCXXyVTA CCCTACACCT GGCtTCGTTT CTACGTTGTT GAACCACATG TACTCAGCAC 4740

CTGGACGATT TGTCCaAGTG TTTTTACATG TCACACTACA CGTATGGCAT CCTATGCATT 4800

TATCTAAATT TAATACCATC GCAAcTTGCG CTTTAATCTT CAAGCCAATT AACCTCCTTC 4860

ATCTTTCTAA CTGCTACATA TAAATCCCTT TGGTTCCCAA TTGGTCCATA ATAATTAAAG 4920
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GGCGCGTTGT GTGAACCACC ACGTGTATCT GTAATTTCTG ACCCAGGCGT TTGAATATGT 5040 

TTATCTTGTG CATGATACAT AAACATTGTA CCTTTAGGCA TACGATX5CGA AATAACTGCT 5100 

CTTGCCGTTA CAACACCATT ACGGTTATAC ACTTCTAGCC AATCATTATC TTGGATATCG 5160

TGTTTTTCAG CATCTTCATT TGATATCCAA ACCGTTGGAC CACCTCTAAA TAGTGTCAAC 5220 

ATATGCTTAT TATCTTGATA CATTGAGTGT ATATTCCATT TTCCATGAGG CGTTAAATAA 5280 

CGCAgTACCA AAGCATCTGT ACCACCTTTA ATTTTCTTAT CTCTATTCCC AAATACCATT 5340 

GGCGGCAATG TCGGTTTATA TACTGGTAAG CTCTCCCCAA ATTGTTGGAA AACTTCGTQA 5400 

TCCACATAAT AACTTTGACG TCCTGTTAAT GTTCTAAAAG GTACTAGACG TTCTATATTC 5460 

GTTGTAAATG GTGAATATCG TCGACCTTGT TTATTTGAAC CTGGGAATAC TGCTGTCGGT 5520 

ATTACTTCTC GTGGTrGTGA AGTTATATTT AAAAACOAAA TTTTCTCAGC AGCGCGTTCG 5580 

CTAGAAATAT CTTTTAACGG CATTCCAGTT TGTTCTTCGA GATCTTCATA TGATTTTTCT 5640

GATAATTTAC CATTCGTAGC AGATGAAATA CTTAGTATTG CATCAGCTAC ATTACGTGCT 5700 

GTATCAATAC GTGGACGATT CGCTCTCACA GAATCATCAT TTGTATCACT CCACGTACCT 5760 

AACATACTTT TTAATTCTTC ATATTGTTCA CTGACACCGA AACTTACACC ATGTGCTCCA 5820

ACTTTCCCTT TTTCAAGTAC AGGACCAAGC GTGACATATT TGTCGTAAAT TTTAGTGTAG 5880 

TCGCGTTCTA CAATTGCAAA GTTAGGCATT GTACGTCCAG GTACCGCTTC AATTTCACCC 5940 

TTCGACCAAT CTTTCACTAC GCCGTATGGT GTTGAAATTT CTTGCTTTGT ATCATGACTA 6000 

AGTGGAGTTG TCACAACATC TTTAAACGTT CCAGGTAAAT AGTCTTTTGC CATTTCTGAA 6060 

AATGCTTTTG CCAACGTTTT ATAAATATCC CAGTCTGAAC GCGATTCCCA TAACGGATCA 6120 


ATGGCAGGAT TGAAAGGATG TACATATGGA TGCATATCCG TTGATGATAA ATCATGTTTT 6180 

TCATACCAAG TCGCTGCCGG CAAAACAATG TCAGAATATA ACGGTGTTGC CGTCATTCTG 6240 

AAGTCTAAAG AGACCACTAA ATCTAACTTA CCTGTTGTTT CTTCACGCCA CGTAATTTCT 6300

TCTGGCTTTT CATCTTCATT TGGTGTAGCT AATAACCCTG ATTTTGTGCC AAGTAAATGC 6360 

TTCATAAAGT ATTCTTGACC TTTTGCAGAA CTTGAAATTA AGTTTGAACG CCATATAAAT 6420 

AATGATTTTG GATGATTCTT TTTCAAATCA GGATCTTCTA TTGCAAATTG TGTTTGTTTT 6480

GATTTCACTT CATCAATTGC ACGTTGCAAA ATCGCTTCAT TTGAATCTAT ACCTTCATCT 6540 

TTAGCTTCTT CTGCAAACAA CAAACTATTT TTATTAAATT GTGGATATGA TGGTAACCAA 6600 

CCAAGTCTAG CTGCTAAAAC ATTATAATCA GCTGGATGTT GATGCTTTAA CTCCTCTGTT 6660 

TTAGCTAATG GAGATTTTAA ACGATCTACA TTTGACTCTT CATATTTCCA TTGGTCTGTT 6720 
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AATGCGACAG TACTCCATCC TTCAATCGGA 
CGACATTTTT CTTGTCCCAC ATAGTGAGCC 6840

CAACCGCCAC CATTCACACC TTGACAGCCA CATAACATAA CTAAGTTTAA GATTGAACGA 6900

TAAATCGTAT CTGAGTTAAA CCAATGGTTA ATACCCGCAC CCATGATAAT CATTGAACGC 6960

CCTTCAGTAT CGATAGCGTT TTGCGCAAAT TCTTTCGCTA CTTGAATGAC AACACTTTGT 7020

TTTACGCCTG AAATGGCTTC TTGCCAAGCA GGTGTATATT TTGATTCTGC ATCGTCGTAT 7080

CCTTTTGATT CTAATTTATG ATCAAAACGA CGCACX3CCAT ATTGACTTGC CATTAAGTCA 7140

AAAATTGTAG CAATACX3GAC TTTGTCACCA TTTGCTAAAG TGACTTGTOG AGTTGGAATT 7200

GGACGATTGA ATATCCCATC TCCATCACTA TCAAAGTATG GGAATTGAAT TGTTTCTAAT 7260

TCGTATCCAC CTTCTGTCAT TGATAATGTA GGGTTAATTT TAGAACCATC TTCTGTTTCT 7320

AGTTTTAAGT TCCACTTCTT ACCTTCTTCC CAACGTTGAC CCATTGTGCC ATTAGGTACT 7380

ACTAAACTAT CGCTGATTGC ATCATGAATA ACTGGCTTCC ATTCGCCTTG CTCTGTTGTT 7440

TGACCTAAGT CACTCGCTCT TAAAAATCGA CCCGCTTTAT ATCCATTTTC ATCTTCATCC 7500

AGCATGATAA GAAACGGCAT ATCTGTATAT TGTTTAGCGT AATTTATAAA GCGTTCATTA 7560

GGTTGATTAA CATAATGTTC TTGTAAAATA ACATGCGTCA TTGCTTGTGC AATTGCAGCA 7620

TCTGAACCAG GATTCGGTGC TAGCCAGTTA TCTGCAAATT TCACATTTTC TGCGTAATCT 7680

GGTGCTACTG AAATGACTTT TGTACCTTTA TAGCGGACTT CAGTCATAAA ATGTGCATCC 7740

GGAGTACGTG TTAAAGGTAC ATTAGAGCCC CACATAATAA TGTATGATGC GTTATACCAG 7800

TCACTTGATT CAGGCACATC TGTTTGCTCT CCCCAAATTT GTGGAGAGGC AGGTGGTAAA 7860

TCTGCATACC AGTCATAAAA ACTAAGCATT TCACCACCAA GCAAATTGAT GAATCGAGCA 7920

CCTGCTGCAT AACTAATCAT TGACATCGCT GGAATAGGTG TAAATCCTGC GATTCGATCT 7980

GGACCATATT TTTTTATTGT ATACAGTAAT TGTGCTGCGA TTATCTCTGT AACGTCTTTC 8040

CAATTTGAAC GCACGTGCCC TCCCATACCT CGGGCTTGCT TATATTGTTT GGCTTTGTCT 8100

TCATTTTCAA CAATAGACGC CCATGCAGCA ACGCGATTAC CATTGTTTTC TTCTAATGCT 8160

TCAGTCCATA AATCCCAGAG TTTTCCACGA ATATATGGAT ATTTGATTCG AAGCGGACTG 8220

TATTCATACC AAGAGAATGA CGCACCTCGT GGACATCCTC TCGGTTCATA TTCAGGCATA 8280

TCCGGACCAC AACTTGGATA GTCAGTTTGT TGATTTTCCC AGGTAATCAC ACCATTTTTC 8340

ACAAATACTT TCCAAGAACA TGAGCCTGTA CAGTTAACAC CATGTGTTGT TCTTACTTCT 8400

TTATCGTGGC TCCAACGTTC TCTGTACATT TTTTCCCATT CTCTACTTTT ACTTTCTAGG 8460

ATCGACCAAT TCCCATTAAA TTTTTCTGTT GGCTTAAAGA AATTCAATCC AAATTTTCCC 8520
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TAAAATGCCC AAGACTATTG CTTTAATTAG ATTGTACATT TTTTCACAAA CATAAAATAT 8640 

TAGGGAATCA CCTAATTACT TAAGGAATTT CCCTATCAAT AACXjGGATTT CATTGAAATA 8700 

ATACACAATC ATGTATGGTC ATGCTTATTG CCAATCTAAA TCGTTCAAAT TTGGCACAAC 8760

GACAAATAAG GCTTCAACAC GAATATATTC TCTCGGTTGA AACCTTACTT ATTCATTTAT 8820 

TTTTTATAAA TTAGTGACAT AACACTGTAT TAGCATCTGC ACGATCGGTT GAAATATATG 8880 


TTACATTTTC TTGCTGCTTA ATAAATGCAT CATAGTAATC ATATTGCGAC GAATGATATG 8940 

TGCCATTCGA TGTATCATTT GGGTTTAGCA AACAGCCATA ACCTTCGTCA TATAAATGTT 9000 

CACAGAGCAT AAGGGCGTCA TGTTTAGAAC CACTTACTAC ATAAAATTGC TTCATAGGAT 9060 

CATATGATTT AGGAGTGTTT TCAGTATAAT CAACAACTTC CCCTATAATA CATATACCTG 9120 

GTTTCGCCTC AATTGAATAG TGTTGCAATT TTGAAATAAT ATTACTTAAA CGCCCCTTAA 9180 

CAACAAACTC GTTAAAACAC GATGCTTGAA AGACAATCGC TATCGGGTAA TCAATATCTG 9240

TGTATTGTTG TATCTGTGTG ATAATTTTCC CTAAACGTTT TACCCCCATA TAAATTGCTA 9300 

ACGTGCCACC ATTCACTAAG GAATTGACAT CCACTTCATT TTCTTCTOAA TCTTTAAAGT 9360 

GACCTGTAGA AAATGTCACA CTTTTAGCAA CTGTACGCAT TGTCAAACCT GTCTGCATAG 9420

TAGCAACTGC tGCGCTCGCT GATGTCACCC CTGGTACAAT TTCAAACGCA ATATGATGTT 9480 

CATTTAGTAT GTCGACTTCT TCTTGCACAC GACCAAATAT CGCTGGATCG CCACCTTTAA 9540 


GTCTAACAAC CTTGTTATAT CGACGCGCTG CTTCCACGAT ACAGTCATTT ATTTTTTCTT 9600 

GCTGAATATG TTTTGCATAC GGCTTTTTAC CAACATCGAT AATTTCAGTA GTCAAATTCG 9660 

CATATTGTAA AATTAACGGA TTCACTAATC 6ATCATATAG AATGACATCC gCTTCACXSTA 9720 


TTAAACGCTC AGCCTTTTTC GTCAAATAAT TCGGATTACC TGGACCCGCA CCTATCAAGT 9780 

AAACCTTGCC ATATTCCTCT ACAGACATAT ATATACGTTC CCGTCTGTAA CTTCTACCTC 9840 

ATAAACATCT ACACAACCTT CATCAGGTTC TTGAACAATA CCTGTATTTA AATCAATTTT 9900

TTGATCGTGG AGCGGGCAAA ATACATATTC CCCACTCACT GTCCCTTCAG ACAATGGTCC 9960 

TTGTTTGTGT GGACAGATAT TGTGAATCX3C ATGAATTTTG CCACTTTCTG TTAAAAACAA 10020 

CCCTACCTCT TTGCCTTTGA CAATAACCTT TTTTCCAATT AGGGGTGTTA ATTCATCTAT 10080

AGTTGTCACT TTAATTTTTT CTTTTGTTTC CATGTATTAC ACCTTCTCCA CTTCAAAAAT 10140 

TCTACGTGCT TGAGCATTGC TAGTTATTGC TTCCCAAGGT TCAGCTTCGA CTGCTTTTTT 10200 


AGCATCCATA ATGCGTTCAA ATAGTTCATT TTGTCTTTCT GGGTCAAGTA AGACTTCTTT 10260 

TACATTTTCA AATCCAAGTC TTCTTAACCA TGGCGCTGTT CTTTCAGCAT ATATACCTGT 10320 
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AGTTGTTAAA AATTCAGCTT TTTCAACTTC TGTACCACCA 


TTACCACCGA TATAGATTTG 10440

GAATCCATTT TCAACTGAGA TAATACCAAA ATCTTTAACA CCTGATTCAA CACAACTTCT 10500

TGGGCAGCCT GATACACCCA TTTTGAATTT ATGAGGTGTA TCGATGTATT CAAATGTTTT 10560

TTCTAAACGA ATGCCAAGTC GTGTCGTGTA TTGCGTACCA AATCGACAAA ACTCTTTACC 10620

AACACAGCTT TTAACTGAGC GTGTTTTCTT ACCATAAGCT GATGcTGAAC GCATACCTAG 10680

GTCTTCCCAT ATATTTGGTA ATTCTTCTTT TTTAACTCCA TACAAACCAA CACGTTGTGA 10740

ACCTGTCACT TTAACTAGTG GCACATGATA TTTCTTAGCC ACTTCTCCTA GACGAATCAG 10800

TTGGTCTGCA TCTGTAACAC CCCCACGCAT TTGAGGTATA ACAGAAAATG TACCATCATT 10860

TTGAATATTC GCATGGTAAC GTTCGTTAGC AAATCTTGAT TCTCTTTCAT CTTCATGATC 10920

AT6TGGATAA ACCATGTTTA AATAATA6TT GATTGCTGGT CGACATTTTG GACATCCACC 10980

TTTATTTTTA AAGTTTAAAA CATGTCGAAC TTCTTTAGAT GTTTTTAAAC CTTTCGCTCT 11040

TATTTGCGTT ACTATTTGAT CGCGTGTCAA ATCAGTACAA CCACATATAC CAGCAGGTTT 11100

TXSCGGCAACA AAGTCATCTC CTAAGGTGTG CTGCAATATT TGAGCAATTT GCGGTTTACA 11160

TTTACCACAT GAATTCCCCG CriTlXi'lTrr AGCCGTTACT TCTTCAACTG TTGTAAAGCC 11220

ATTTTCCGTA ATCGCATTTA CTATAGTACC TTTATCAACA CCATTACAAC CACAAATTGT 11280

TTCATCATCA GCCATATCAG CAATTGATAG CGATGCCTCT TCTCCACCTT TAGTAAGCAA 11340

TGATACAAGT GTGTAATCTT CAGTGGATTC ACCTTTTTTC ATCATGTTAT AAAAGCGTGA 11400

ACCATCATCG ATATCACCAT ATAGTACTGC ACCAACTACA TTACCGTCTT TTAAAAAGAT 11460

TTTTTTATAG TTATTATCAA CACTATTAAA TATTTCAATA CCTTTAATTT CTGCATTTTC 11520

TACAATTTGA CCAGCACTAT ACAAGTCACA CCCAGAAACT TTTAATGACG TAAATGTTGT 11580

TGATCCCTTG TATCCGTTCG TTTCTTTATT TGTTAAATGA TCAGCTAATA CTTTACCTTG 11640

TTCATATAGT GGTGCAACGA GTCCATAAAC TTTGCCGTTA TGTTCTGCAC ATTCACCAAC 11700

TGCTTATACA TTGCTATCAC TTGTTTGCAT CACATCATTG ACAACAATAC CACGATTAAC 11760

ATCTAGACCT GATTCTTTGG CTACTTCTGT GTATGGTCGT ATACCTACTG CCATAACAAC 11820

TAAGTCTGCC GGAATCTCGC GTCCATCAGC CAATTTAACA CCCTCAACAT CATCTTCTCC 11880

TAAGATTTCA GTTGTGTTGG CTTGCATTTC AAACTTCATA CCTTGCTTTT CTAGATCTGC 11940

TTTAAGCATA TTTCCAGCTT TACGGTCTAG TTGCATTTCC ATCAACCATT CAGCTAAATG 12000

TAACACCGTT ACTTCCATAC CTTGATCTAA TAAACCACGT GCACACTCTA AACCTAGTAA 12060

TCCTCCACCA ATTACAATTG CTTTCTTTTT AGTCTTAGCA ATGTTCATCA TTTGTTCAGT 12120
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10 






GAATGCTTTA GAACCTGTCG CAAAAATCAA TTTATCGTAT GATACTTCAA TACCATTTGC 12240

AGTAGTAACT GATTGATTTG CTCTATCTAC TTCAATTACA GGATCATTTG TAATTAACTC 12300

GATACCATGT TCCTCATACC ACTCATATGG ATTCATAATT GTTTCTTCAA CTGTCATTTT 12360

ATTTTGTAAA ATATTTGAAA GCATGATGCG GTTATAGTTT GGATAAGGTT CTTTACCTAT 12420

TACCGTAATA TCATATAAAT CGTTGGCGCG CTCTAATATT TCTTCGATTG TTCGAATGCC 12480

CGCCATACCG TTACCAATCA TTACTAGTTT TTGCTTTGCC ATAAAATATG CCCCTTTACT 12540

CCATAATATT TATTTCAAAA AAAGGTATTA ATTTTTCGTT AGTGCTTTTA TATTTTCATT 12600

GGAATCATTA AGCTTTCTAA TCTATCGTTA ATGATTTGCT TTAAAATTGG GTCGAAGTTA 12660

ATTGAAGGTG TGAAGTGTAT ATCTGTATTA ATAACCATGT CATTCATTTG CTGCTTCACT 12720

TTGTTAACAA GTCTTCCGTC ATATAAAAAT AATGGTACGA CAATCAATTT TTGATACCGT 12780

TTCGAGATGC TTTCTAAATC ATGTGTAAAA CTAATCTCTC CATATAGCGT TCTCGCATAT 12840

GTCGGCTTGC TAATTTGCAA ATTTTGAGCG CATATTrGTA ACTCTTCGTG TGCCTTA6TA 12900

AACTTTCCAT TAATATTGCC GTGTGCAACA ACCATAACTC CAACTTGTTG TTCGTCACCT 12960

GCTAATGCGT CACAAATACG TTGTTCAATT AATCGTCTCA TTAAAGGATG TGTGCCAAGT 13020

GGCTCGCTTA CTTCTACCTT TATGTCTGGA TACCGTCGTT TCATTTCATG AACGATATTC 13080

GGTATATCCT TGAGATAATG CATTGCACTA AAGATTAGCA ATGGTACAAT TTTAAAATGG 13140

TCAACCCCAC TTTGAATCaA CGTCGTCaTT ACCGTCTCTA AATCCtGATG CTCACTTTCt 13200

AAAAACGCAA TATCATAGTG ATGTATATCA TCTTTTACTA ATTCAGAAAT AAATGCTTCT 13260

AACGCTTGaT TCTGTCGTCC GTGCCTCATG CCATGTGCAA CAATGATATT CCCATTCACA 13320

TTTACCAACC CTTTCACACG TATTGTATAC CAAATCATTT TGTTTTTGTG AAAAGAATCA 13380

CATXATAATG TAAAATCAGG GAATTCCCTG ATGCCTGTAG TCATGCATAT TCCTTATACA 13440

TTTTCCCTTT TTGTTAAATC AAAAAAAGCG ACCGATATAT GAATCCCTAC TCAACATTTA 13500

TTTGAGCAAG CATTAATATA TCGGTCGCTT GTAGTGTATA TTATTATCTT AAAATGGTGG 13560

TTGGCCTAAT ATTGTTTCGT CAAAGCGCTC GGGTATCAAT ACTTTGCGCA TGATCACACC 13620

TAAATCGCCA TCATCATTTT CATGTTCGCT GTATATTTCA TAACCTCTTT TTTCATAAAT 13680

TTTAAGTAAC CACGGATGCA ATCTTGCAGA TGTACCTAAA GTAACTGCCG CTGACTTTAA 13740

CGTATCTCGC AAAAATGCTT CTTCAACATA AGTAAGTAAT TGGCTACCAT AGCCTTTCCC 13800

TTCATACTCA GGATTTGTCG CAAACCACCA GACAAAAGGA TAACCCGAAA TACTTTTCAC 13860

ACTTCCCCAA GGATATCTAA CCGTAATCGT AGATATAATT TCATCATCAA TTGTCATGAC 13920
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CCAATCAATA CCTAGTTCTC TTAGAgGCGT AAATGCTTCA TGCATGAGTT CTTGCAATTT 14040 
TTCTGCATCT T 14051 
(2) INFORMATION FOR SEQ ID NO: 104:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1885 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

TAATCCTCAA CTTnGATTAT ATGGCTTGGG CGCATATGAA CTGCTTAGTT TAGTGTATGA 60 

CATTCATACA GTTOGCATGA CTATCATACA ACCTCGAATA GATAACTTTT CTACTGAAGA 120 

GTTACCAATC TCAAGATTAC TTCAATGGGG AACCGATTTT GTTAAACCCT TAGCCAGACT 180

TGCTTATAAC GGTGAAGGTG AGTTTAAAGC AGGTAGTCAT TGTAGATTCT GTAAGATAAA 240 

GCATTCATGT AGAACACGTG CAGAATACAT GCAAAATGTG CCTCAAAAGC CACCACATTT 300 

GTTGAGTGAT GAAGAGATTG CAGAACTTTT ATATAAACTG CCTGATATCA AAAAATGGGC 360

TGATGAAGTA GAGAAATATG CGTTAGAACA AGCGAAAGAG AATGATAAAA CGTATCCAGG 420 

TTGGAAGCTA GTCACGGGAC GTTCAAGGAG AGTGATAACT GATACAAAAG CAGTCCGAGA 480 

CAGGTTAGTT GAAGCGGGTT ATAAACCTGA AGATATTACA GAAACCAAGT TACTTAGCAT 540 

TACGAATTTA GAAAAATTAA TCGGCAAAAA AGCATTTTCT AAAATTGCAG AAGGCTTTAT 600 

AGAAAAGCCG CAAGGTAAAT TAACACTTGC TACCGAGTCT GATAAACGAC CAGCTATAAA 660 

GCAATCTOCT GAAGAT6ATT TTGACAAACT ATAAAAATTA AAAAGGACGG TATATAAACA 720 

TGA^tiAGCAAA AGTATTAAAT AAAACTAAAG TGATTACAGG AAAAGTAAGA GCATCATATG 780

CACaTATTTT TGaACCTCAC AGTATGCAAG AAGGGCAAGA AGCAAAGTAT TCAATCAGTT 840 

TAATCATTCC TaAATCAGAT ACAA6TACGA TAAAAGCCAT TGAACAAGCT ATAGAAGCTG 900 

CTAAAGAAGA AGGAAAAGTT AGTAAGTTTG GAGGCAAAGT TCCTGCAAAT CTGAAACTTC 960 

CATTACGTGA TGGAGATACT GAAAGAGAAG AT6ATGTGAA TTATCAAGAC GCTTATTTTA 1020

TTAACGCATC AAGCAAACAA GCACCTGGTA TTATTGACCA AAACAAAATT AGATTAACGG 1080 

ATTCTGGAAC TATTGTAAGT GGTGACTATA TTAGAGCTTC AATCAATTTA TTTCCATTCA 1140 

ACACAAATGG TAATAAGGGT ATCGCAGTTG GATTGAACAA CATTCAACTT GTAGAAAAAG 1200 

GCGAACCTCT TGGCGGTGCA AGTGCAGCAG AAGATGATTT TGATGAATTA GACACTGATG 1260 
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TTGAGGTGTC AAGAATTTGA AATTTATGAA TATAGATATT GAAACATACA GCAGTAACGA 1380 

TATTTCGAAA TGTGGTGCCT ATAAATACAC AGAAGCTGAA GATTTCGAAA TTTTAATTAT 1440 

AGCTTATTCG ATAGATGGTG GAGCGATTAG TGCGATTGAC ATGACTAAAG TAGATAATGA 1500 

GCCTTTCCAC GCTGATTATG AGACGTTTAA AATTGCTCTA TTTGACCCTG CTGTAAAAAA 1560 

GTATGCATTC AATGCTAATT TCGAAAGAAC TTGTCTTGCT AAACATTTTA ATAAACAGAT 1620 

GCCACCTGAA GAATGGATTT GCACAATGGT TAATTCAATG CGTATTGGCT TACCTGCTTC 1680

GCTTGATAAA GTTGGAGAAG TTTTAAGACT ACAAAGCCAA AAAGATAAAG CAGGTAAAAA 1740 

TTTAATTCGT TATTTCTCTA TACCTTGTAA ACCAACAAAA OTTAATGGAG GAAGAACrAG 1800 

AAACCTACCT GAACATGATC TTGAAAAAtG GCAACAATTT ATAGATTaCT GTATTCGAGA 1860 

TGTAGAAGTA GAAATGGC6A TTGCT 1885 
(2) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2656 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 

TAATCCTTAG TTCACTGnCA AATTTCAAAA CACCAGTTCC CTCTATCTGC ATCCATAGAA 60 

ACTGnATGTT TGTGTCAATA ACCGGATTAT ATTGTGATGn TGTTTGTAAC TCGATTAAGT 120 

TATCATCTTT CGAAAAATTA TCTACTACCA TTATTCAACC ACCTTTCCTT CGAATAAACT 180 


CCATTTACCA ACkCCACCAG TACCAAAGTT TCTAACTAAA AATTGATGTG CAGACGGGAA 240 

GTTXTTACGT CTTAATACTT GTGTTGTATT ACCTGGTGTA TTOGATTTTA CTTCTAATAT 300 

CCAACCTGCA ATACCTTTAA AGTCTTTAGG AAAATCAGTA AATCGGTTTG ATTCTTCAGT 360 


AGTGATATAG AAATCTAAAC CAACGATTTT TAAATCTGAT AATTTTGTAA TACTCTTAGG 420 

GATATGTTCC CAATAACCGG CGTTTTGCGG GCAGAAATTC CATGCTCCGT TGTTTTTCTT 480 

ATTGAAAAT6 TCAATGACAC GTTCGAATTT AAGCATATTT CTACCTGTGC TGTTTCTGGt 540

AAGTACTTGT CTTAGAGCAC CATTATAGTG TCCAGGCAGT ACATCCAAGA ACCACCCTGC 600 

ATCTCTAAAC GCTTTCGGTA ACGGGAAATC TAATGCATTT TGTGTGTCTT GaCGTATAGA 660 

TATAGTAATG ACCAACTTCC GTAATATCAC TTAGATATGC TGGGTTCTGT ATTGGTAACG 720

GTTTAACACG TCCGCCTGAA TCAGTCATTG ATACTTGAGG TGCGATGTTT TTCAAGAATT 780 
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TAGTTACCCC GATTAGAAGT GCTTTACGTC CTGTTTCTAG ATCGTAATAC ATATCTAGAC 900 

CCTCAGCCTC TTGGAAATCT CCTTTAAAGT TGTTATTCAC ACCGCCTATA TCGATGCGAC 960 

GTTTAAATAA CAATTCTTTC GTTTTGATAT CGAAGCCTTG TAAGTAGTTA GGGTTGGCTG 1020 

TATTCGAATC ACCTGTATAC CAATATAAGA TACCTGCATC ATAAGTX3ATA CCTTGCATAG 1080 

GTTGTGTATC TGAAGTGTAT TCCATAGGTA TATCCATTTG ATACAATACT TTGTCTATAC 1140 

CTTTATCAAT ATCGTCAGCA CTTCTAACCT CAACAAAGTT CAACGAATTC TTAAGTTQTC 1200 

TTTCAGTGGG TTTATATTCA CGTCTAAAAA TCAITAAATT TTCTACCGGA TTATAAATCG 1260 

CTGACGTATA TCTGTCGTTA AATATATTCG GCATGACATC TTGCATTTCA TTACCATAAG 1320 

TTATTTCTCC AGTTCTATAT TOGAAACGTA CAAACTTGTT GTTTTTGTTA CTGTCCAATA 1380 

CAGCTGAATA AATCCATAAT TCTCCATCAA TGTATCTATA CGCATTGTGT GTACCX3TX3AC 1440 

CGCCGTTTTT AACAAGCAAT CTATCAATAA ATTGTCCGTT GGGCTTCAAT CTAGATAACA 1500 

TGTAATGATT ACCTGGACGA GCTTGCGTCA TATAAATAAT TTTCX3TTCTA GGGTCTACCC 1560 

AAAATGATTG CATTACTGCA TTTGTATATG GCGATAAATC AGTGATAAAT TCCGGTTCTT 1620 

GCTCTTTTGG TTCGAATCGG TATTCTGTCG CTCX3ATATTC TTTATAGTGT TCATCTACAG 1680

CTTTCTCAAC CTTTTTAGTG AAAACATCTA GTGTTGAATA ATCATGATAC AAACGATCTT 1740 

GCAATGTCTT ATGACCATAA CCTGTATTAT CAACGCGCGC GTCTTTTAcT TCGTTGATAC 1800 

CGTCGCCGTT ATGACCTAGT ACCATGTTGC TAAATCGACC GTTTAAATAT GTTAAAAAGT 1860 

CAGAGACGTT ACTTGTAACA TTTAAATGTT CATACTTTAT TTGTTCTCCA TCATGTGCGA 1920 

ATACCTCTTT ATTTCTGTGG TATTCAAGAG AGAAATTAAA ATCCGTCAGC ATGTCTGAAA 1980 

TAAGTTTAAA GTTATACTCA TTTTCATCTA CATATCTGTA GTCAAAGACT CTACTTAAAT 2040 

CTGTAATTAG TTTATTACTC ATGTTTTCCT CCTTTACTAT CCATAAAACT GATmATAATT 2100 

TTTAATAAGC TCATACATAA TAACTTCATG ACCTCTTTCA TTAGGATGTA ATCCATCAGG 2160 

CATGCTAGAT TTTCTAAATG CTGGATTATA TGGTTTGAAA TAATCTGTGT GATAAGCATC 2220 

ATATACTGGT ACATCCAATT CACTACAAGC CAATATCTGA GCATTGACAT AATCCTCTAA 2280 

AGTTAACCCT AGTTTGTTTT TGTCCGTATC TTTACGGCGT ATCGTTGTAC CACTCATAQG 2340 

GCATTGCCTA GTAGCTGTCA TTACAAGTAT TTTTGAAGCT GGATTATTTT TCCTGATAAC 2400 

TTCAATTGCA GAACAAAAGG CGCCGTAAAA CGTTTTAGTG TCGGTTTTAT CAGTGCCTAT 2460 

CX5GTACGCCT GCCCAATAAC CATGTAACCA GTCATCATCT GTACCTTGTA ATATGATTAG 2520

GTCTCCTCTT ATTTGCTCTG CTTGTCTaTA AATGCTGTTT TCTaCCGCTT CTTTACCTAT 2580 
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CTTGCCTAAC ATTTCT 

(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4854 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 



AAAATGAGGG TTCTAGCGGA AATTACCAAA AGCGTGGTTC ATACTATGGG CAGCGTAATC 60

GTATTTCAAA AGAAAAAACA CCTAAATGGT TAGaAAATAG AGATAAACCT AGTGAAGAAG 120

ATTCGGCTAA AGATAATAGC GTAGATGATC AACAATTAGA GCAAGATCGA CAAGCATTTC 180

TAGATAAATT ATCTAAAAAA TGGGAGGAGG ACAGTCAATA ATGAAGCAAT TTAAAAGTAT 240

AATTAACACG TCGCAGQACT TTGAAAAAAG AATAGAAAAG ATAAAnCAGA AGTAATCAAT 300

GACCCAGATG TTAAGCAATT TTTGGAAGCG CATCGAGCTG AATTmACGAA TGCTATGATT 360

GATGAAGACT TAAATGTGTT ACAAGAGTAT AAAGATCAAC AAAAACATTA TGACGGTCAT 420

AAATTTGCTG ATTGTCCAAA TTTCGTAAAG GGGCATGTGC CTGAGTTATA TGTTGATAAT 480

AACCGAATTA AAATACGCTA TTTACAATGC CCATGTAAAA TCAAGTACGA CGAAGAACGC 540

TTTGAAGCTG AGCTAATTAC ATCTCATCAT ATGCAACGAG ATACTTTAAA TGCCAAATTG 600

AAAGATATTT ATATGAATCA TCGAGACCGT CTTGATGTAG CTATGGCAGC AGATGATATT 660

TGTACAGCM TAACTAATGG GGAACAAGTG AAAGGCCTTT ACCTTTATGG TCCATTTGGG 720

ACAGGTAAAT CTTTTATTCT AGGTGCAATT GCGAATCAGC TCAAATCTAA GAAGGTACGT 780

TCGACAATTA TTTATTTACC GGAATTTATT AGAACATTAA AAGGTGGCTT TAAAGATGGT 840

TCTTTTGAAA AGAAATTACA TCGCGTAAGA GAAGCAAACA TTTTAATGCT TGATGATATT 900

GGGGCTGAAG AAGTGACTCC ATGGGT6AGA GATGAGGTAA TTGGACCTTT GCTACATTAT 960

CGAATGGTTC ATGAATTACC AACATTCTTT AGTTCTAATT TTGACTATAG TGAATTGGAA 1020

CATCATTTAG CGATGACTCG TGATGGTGAA 6AGAAGACTA AAGCAGCACG TATTATTGAA 1080

CGTGTCAAAT CTTTGTCAAC ACCATACTTT TTATCAGGAG AAAATTTCAG AAACAATTGA 1140

ATTTTAAAAT GATTGGTGTA TAATGAATAC AAATCTAAAT CGTTTAAATG ATTGAAGACA 1200

AGATGATCTA ATCAATATTA CACAGAAAGC CATTGTTTGA TGAGAATATG GTTAATAAAT 1260

TAGATGATTA CTACTTCATT TATGGTATTT GTAATGAATA CCCGGATCAA GACCGTTATC 1320
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CTCGTCCCTT GTATAGGGGC GGGATTTTTT GTTTTTTTCA GACATAAATG TTTGTTGGTG 144 0 

TCATAAATTC CCTGTTTATT GTTAATAGGT TTAATGTTAA AACGATGATT GTTGTTCAAT 1500 

TTrTTAACGA GGTCAGATAA AAGTATTTAT AAAGCAAATA GGAGGGTTTA ACATGGAACA 1560 

AATTAATATT CAATTTCCAG ATGGTAATAA AAAGGCGTTT GATAAAGGTA CTACTACTGA 1620 

AGATATAGCA CAATCAATTA GTCCTGGATT ACGTAAAAAA GCTGTTGCCG GCAAATTTAA 1680 

CGGGCAACTT GTAGATTTAA CTAAACCGCT TGAAACTGAT GGATCAATTG AAATTGTGAC 1740 

ACCAGGTAGT GAAGAagcGT TAGAGGTATT ACGTCATTCT ACTGCACATT TAATGGCACA 1800 

C6CGATTAAA AGGTTATATG GTAATGTTAA ATTTGGTGTA GGTCCTGTAA TAGAAGGTGG 1860 

ATTCTACTAT GACTTCGACA TTGACCAAAA CATCTCATCT GATGACTTTG AACAAATTGA 1920 

AAAAACAATG AAACAAATCQ TTAACGAAAA TATGAAAATC GAACGAAAAG TGGTTTCACG 1980

AGATGAAGTC AAAGAGTTAT TCAGCAATGA TGAATACAAA TTAGAATTAA TCGACGCX3AT 2040

TCCTGAAGAT GAAAATGTAA CATTATATAG TCAAGGTGAT TTTACTGATT TATGTCGTGG 2100 

AGTTCACGTT CCATCAACAG CTAAAATTAA AGAGTTTAAA CTATTATCTA CAGCAGGTGC 2160 

ATACTGGCGT GGAGATAGTA ACAACAAAAT GTTACAACGT ATATACGGTA CTGCTTTCTT 2220 

TGATAAAAAA GAATTGAAAG CACATTTACA AATGTTAGAA GAGCGTAAAG AACGTGATCA 2280 

TCGTAAAATT GGTAAAGAGT TAGAACTATT CACAAATAGC CAATTAGTTG GTGCTGGTTT 2340 

GCCATTATGG TTACCTAACG GTGCAACAAT TAGACGTGAA ATTGAACGTT ACATTGTTGA 2400 

TAAAGAAGTT AGCATGGGAT ATGACCACGT TTATACACCA GTACTTGCTA ATGTTGATTT 2460 

ATACAAAACA TCTGGTCACT GGGATCACTA TCAAGAAGAT ATGTTCCCAC CAATGCAGTT 2520 

AGATGAAACT GAATCTATGG TATTACGTCC AATGAACTGT CCACATCATA TGATGATTTA 2580 

TGCQAATAAA CCACATTCAT ATCGTGAATT ACCTATCCGT ATCGCTGAGC TAGGAACX3AT 2640 

GCATAGATAT GAAGCAAGTG GTGCTGTATC AGGATTACAA CGTGTTCGTG GTATGACTTT 2700 

AAATGATTCA CATATCTTTG TTCGACCTGA TCAAATTAAA GAAGAATTCA AACGCGTTGT 2760 

AAACATGATT ATTGATGTGT ATAAAGACTT TGGTTTCGAG GATTATAGCT TTAGATTAA6 2820 

TTATAGAGAC CCTGAAGATA AAGAAAAGTA CTTTGATGAT GATGATATGT GGAATAAAGC 2880

TGAAAATATG CTTAAAGAGG CAGCGGATGA GCTTGGCTTA TCGTACGAnG AAgCGATTGG 2940 

TGAAgCGGCA TTCTATGGTC CGAAACTAGA TGTTCAAGTT AAAACAGCGA TGGGTAAAGA 3000 

AGAGACATTA TCAACAGCAC AACTTGATTT CTTATTACCA GAACGTTTTG ATTTAACTTA 3060 

TATTGGTCAA GATGGTGAAC ATCATCGTCC AGTTGTTATT CATCGTGGTG TTGTATCAAC 3120 
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AGCGCCAAAA CAAGTTCAAA TCATTCCAGT TAACGTTGAT TTACATTATG ATTATGCGCG 324 0 

CCAATTACAA GATGAATTGA AATCTCAAGG CGTTCGTGTA AGTATTGATG ACCGTAATGA 3300 

AAAAATGGGT TATAAAATCA GAGAAGCTCA AATGCAAAAA ATACCTTATC AAATCGTAGT 3360 

TGGGGATAAG GAAGTTGAAA ATAATCAAGT GAATGTGCGT CAATATGGAT CGCAAGACCA 3420 

AGAAACAGTT GAAAAAGATG AATTTATCTG GAATCTAGTT GATGAAATTC GTTTGAAAAA 3480 

ACATAGATAG ACAGTTGTCG CAATAAAATG CTTTAAAACT TTTATTGCX5T ATCAAGTTTT 3540 

ACAGGGTTGA TTATGCGTGA TGAATCCTGT ATATTACAAG TTAGTTAAAA TATTAAATTG 3600 

AGTTAGAGGT TGCATGTTTA ATTAGTAACT TGTCAGAAGT ATTTATGGTA CATAAGTTGA 3660

ACAAGTGAAA GGTAAAGATG CCGAAATAGA TATAAACCAT AAATTATATC TATTCGGACA 3720 

GTTTTCGAAT AGGAACTGTA CTGTCACAGA ATGTGATGTG CTACCTTATA TAGATAATTG 3780 

CCAAAGTGGT TGCATATCTT AAAGGTATGT AGCCACTTTT TTACTTTTAA TATCACTATG 3840

TTCTGTAAAA AAGGGTATGA AAGTGAATAA AGGTTATTTA TTTCTTGGCC TCTAAAACAT 3900 

G6AAAGGGAG CTTATATGTC AAAAGTTCAA AATGAAAGTA ACAATGTTGT CAAAAGGGGA 3960 

CTTAAAGATC GTCATATTTC TATGATTGCG ATTGGGGGTT GTATTGGTAC AGGTTTATTT 4020

GTAACTTCTG GTGGAGCAAT TCATGATGCA GGTGCTTTGG GTGCATTAAT AGGATACGCA 4080 

ATTATCGGAA TAATGGTATT TTTCTTAATG ACGTCACTTG GCX5AAATGGC TACGTATTTG 4140

CCAGTATCAG GTTCATTTAG TACATATGCT ACAAGATTTG TTGATCCATC TTTAGGGTTT 4200 

GCGCTTGGTT GGAACTATTG GTTTAACTGG GTAGTGACTG TAGCAGCAGA TATTACGATT 4260 

GCAGCACAAG TCATTCAATA TTGGACACCA TTGCAAGGCA TACCCGCTTC GGCATGGAGT 4320 

GCGTTGTTCT TAGTTATAAT TTTTAGTCTG AATTCGTTAT CAGTTCGCGT CTATGGTGAA 4380 

AGT(SAATACT GGTTGGCATT GATAAAAGTG GTTACA6TTA TTGTTTTCAT TGCAATTGGT 4440 

TTATTAACGA TTGTCGGAAT CATGGGTGGT CATGTT6TAG GATTCGAAAT ATTTAATAAA 4500 

GGTGAAGGTC CAATTCTTGG TGGCAACTTA GGAGGAAGTT TGTTATCAAT TCTAGGTGTA 4560 

TTCTTAATCG CTGGTTTCTC ATTCCAAGGT ACTGAGTTAA TTGGTATTAC GGCTGGTGAA 4620 

TCAGAAAATC CTGAACGTGC TGTGCCGAAA GCAATTAAAC AAGTATTCTG GAGAATTTTA 4680

TTATTTTACA TTTTAGCCAT TTTTGTTATC GGTATGTTAA TTCCTTATGA TAGTAGTGCA 4740 

TTAATGGGGG GTAGTGATAA TGTAGCAACG TCTCCATTCA CATTAGTGTT TAAAAATGCT 4800 

GGATTTGCGT TTGCAGCATC ATTTATGAAT GCAGTCATTT TAACGTCTGT GTTA 4854
(2) INFORMATION FOR SEQ ID NO: 107: 
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(A) LENGTH: 2488 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

ATCAAAAATT GATTGTTTTC nATTTTTTGT TTCAGOGCGG GATCTTTTAC GTCTTTTGTG 60

AAAACGaTTT TATTATTAAC TACTTTTACT GGATAACTTT TGTATGTCGA GTCAGTAGCA 120 

TTTTTTCTAT CGTTTGTAGT TQTGTCATAT TCACCAgTTA TTTTATGTGT GTTCTTATCT 180 

ACCTTTAACA ACATACGGTC TTCTTTTAAA AGCTCATCTG ATCCAACAAC TGAATAAGAG 240 

GATTCTATAT ACCATGTGTC TTGATCATTA TTTTCATAAT GGGGATTATC GTGACCATCA 300 

ATTTCATAAA GCGTTTCTAA GT'lTiTAATA GGATACGTAC TTAGTACTTT TTTAAGACCA 360 


TCTTTCAAAT GAATTTGTTC CCACTTCATT GCCAAAAACA TATCGCCACT GACTACAATT 420 

GAAATAATAA TAATTGCTGC TAAGTTTAAC CAGAAAATTT TATGTGCTTT CATACATTCC 480 

CACCGTTTCT CAAAATACTT CATTAACACT ATAATAATAT ATTTTGAAAA ATATTTACAT 540 


CAGTATTAAA GTGAATATCA AATTTTAAAT TTATGAAAAT AATAGATATT TATAAAAAGC 600 

GGAAAAGAGA TACAATAAAA AACTGCATGA CGTTTGAGAC GTCACACAGT GTAACTAAAA 660 

ATTTAAAAAG TTGTTGCTAA TTTTTCAGCA TTATTAATAC TAGTTGCTTT AATTTCTTCA 720

GTCTTATGAG GTTCAGCATT GTGTCCTTCA ATAATGATTG TTTCATATGA TGGCACACCT 780 

AAGAATGTCA TAATTGTTCT TAAATAACGG TCACCCATTT CAAAATCAGC AGCAGGTCCT 840 

TCAGTATAAT ATCCACCACG TGATTGAATG TGTAATACTT TTTTGTCAGT TAGTAAACCT 900

TGTGGTCCTT CAGCAGAATA TTTAAAAGTT TTACCT6CAA TTGAAATAGC ATCAATATAT 960 

GCTTTAACTA CAGGTGGGAA AGAAAGGTTC CACATAGGCG TTACAAATAC ATATTTATCT 1020 

GCACTTAAAA ATTCTTCTAA AATGTCACTC AATCTTGAAA CTTTCATTTG TTCATCATCA 1080

GTTAACGTTT CGCCATTACT CATTTTTCCC CAACCAGTTA ATACATCTTT GTCAATAACT 1140

GGAATATAAG TTTCArATAA ATCAATATGT TTCACTTCAT CATCAGGATG TTGTTGTTGA 1200 

TATGTTTCGA TAAATGCTTT ACCAGCCGCC ATAGAATTTG ATACCAGTTC ATTAAAAGGG 1260 

TGTGCTGTAA TATATAATAC TTTTGCCATT TGAAAATTCT CCTCTGkTTC TGTTATTTTC 1320 

TTAAGTATAA TTATTATACT CGATATAAAA TTTAATATCA ATCAAAATAT TCAAATTACC 1380 


ATCATTTTCT TCATCTATAT nTGGCAGTAC TACTAAAGTA TGAGTGCATT TAATTATGAa 1440 

ATAGTTGATT TaGAATAtAT ACTTAATACC CAAAATATAT 6AAGGATGGA TGCCACTATG 1500 
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ATTATTTATA TAGATGACAT TCAAAAATGG TTTAACCAAT ATACCGATAA ATTGACACAA 


1620 




AATCATAAAG GACAAGGACA CTCAAAATGG GAAGACTTTT TTAGAGGGAG TCGGATTACT 1680

GAGACTTTTG GTAAATATCA ACATTCACCA TTTGATGGTA AGCATTATGG CATTGATTTT 1740

GCATTGCCAA AAGGTACACC AATTAAAGCG CCGACGAATG GTAAAGTAAC ACGTATCTTT 1800

AATAATGAAT TGGGCGGCAA GGTATTACAG ATTGCC6AAG ACAATGGAGA ATATCACCAG 1860

TGGTATCTAC ACTTAGACAA ATATAATGTC AAAGTAGGTG ATCGAGTCAA AGCAGGTGAT 1920

ATTATTGCAT ATTCAGGCAA TACAGGTATA CAAACGACAG GCGCACATTT ACATTTTCAA 1980

AGAATGAAGG GTGGCGTAGG TAATGCATAT GCAGAAGATC CAAAACCGTT TATCGATCAG 2040

TTACCTGATG GGGAACGTAG CCTATATGAT TTGTAGTTAT AGAAGGGT6C CCGCAGTCTA 2100

AAAAATTAAG CAATCATTGT GTGAGTATGA TACTTACATA ATGGTTGCTT TTTTCAATGA 2160

AAATCGTAAT GCTAAGTCAT ACTTGTTTGA TTTAGATATT ACTTAAAATG TAAGACAAGG 2220

TTGTTAGCAT TGGCAGTGAA ATATCGCACA TAAAAAACAT TATTGTCACA CTAGAAAATA 2280

GTTGTGCACT ATATCAATTT TCTGTATAAA AGTTTAATTC TGACAGTAAT GTAAACGTTT 2340

ACAATTTATG ATTGACATTA ATAATGACTG AATATATGAT TTATGTAAGT ATTTGTGCAA 2400

CGTTTTCACA AAGTGTATTG CACaAyCAAA CTGtAAACaA aGTATGGGGg GCCATAACAT 2460

GGCAGAACTA AGTTAGAGCn TATTAAAA 2488




(2) INFORMATION FOR SEQ ID NO: 108: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4093 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 






TTTTCTTTAT TTCAAmCTGT ATATTaATGA TGTCACTTCA TTTGATACGA TTCTTGATAA 60

CCTATTCAAA ATTCCGCCAA ATAACATAAA TATTATATAA ATGCCGATAC TTTTAATCAT 120

TTTCTACTTT TTCTTCGATA CGGAAACTTG TTTTCGAATT GAACACTTCA CCAGCTTTTA 180

AAATTGACGG TGCTTTTTCA CCATATAAAT TAATATCATT TGGTAAAAAT TGTGTTTCTA 240

ATGTAAAGCC AGAATGTGGT TTATAAATAT TAAATGGACT ATCCCACTCA TCAGGCTGGT 300

TAAAAGTAAA GAACACAACA TGAGGCATAT CTGTATCGAC CTCTAACATA AATTCATGAT 360

TTTCAACATA CATTTTATGT TCACCAACTG TAAATGGGTG ATCGAGACCA CCAAAACGTG 420
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10 



TATCTTCAAA CACTTCATGT AAATCTAGAA TATCACCTGT AACAATATTT CGCTCATCTA 540 

ATACATACAT ATCTAATTGA TTACTTGAAA TGCGATGATT ATCAACGACA TTATTATCTC 600 

GATTCAAATT GAAGTACACA TGATTCGTAG GACTAAACAA TGTGTCTTCT GATGCAACTG 660 

CTTCGTATTC AATCGACCAT TGGTGATCCG CATCATAAAT ATGTGTAATC GTCACATCGA 720 

TATCACCCGG GAAATGATCA TCAGCTGATT TCAACACCX5T CTTAAATATA ACTTTAATTT 780 

GAGCAATTTC ATTTCTAATT TCATAATCAA ATAACTTATT GTCCAAACCA TGACATCCAC 840 

CATGTAAATG ATGTTCACCG TTGTTTTTTT CTAACTGATA TTCTTTACCT TTCAACTTAA 900 

ATTTAGCATT ATCAATTCTA CCGCTATATC TTCCTATAGA AGCACCAAAT TTAAAAGGAT 960

TACTATGATa AAATTCATCC GCTTCAACAA CATTTCCAAG AACAATATTA TTATCATGAT 1020 

ATTTCCAAGA CACTACTCTT GCTCCATAAT TCGTAAAAAT AATTTTAGTT TCATCATTAT 1080 

CAATTTTGAT TAAATCTACA CCTTGTCTTT GGTGCTCAAC TTCAACTATC ATTTTTACTT 1140 

CTCCCTTCTA ACCACAAGTG TTCAAGCTCT GCTGGGTAGC AACATTACTA AAACACCTAC 1200 

AATACAAATG ATTGCACCGA TAACATCATA TTTATCTGGC ATTTGTTTAT CTACGACCAT 1260 

CGCAAAAATC AAACTCATGA TGATAAATAC GCCACCATAT GCTGCATATA CTCTTCCGAA 1320 

TGATGGAAAT GATTGAAATG TCGCAATGAC ACCATATAAC ATGAGTATCG CACCGCCTAT 1380 

TAGCCCAACA AGTGAAGACT GTCCTTCCCT AAGCCACAGC CAAATCAGGT ATCCCCCACC 1440 

TATTTCACAT AAGCCAGCTA ATATAAATAT AAAAATCGGA TATAACATGA AATCACTCCA 1500 

TCACACATTT GCTATCAATA ATCTATCGGC TACATATCAT TTGTTTACAT TTCTTCTTAC 1560 

TTCACATTCC CATTTTAAAA AGTTCGTTTT CACATTCATA TTGTACACTT TTTTAGACAT 1620

TATTCTATAG CTAAATATAA AAAAATAAGA GTAACACGCT TTCATCATCA TTTTATATGA 1680 

TAAATGTGTG TCACTCTCAT CAATTTTATT TTTTAAATAC ACGTTTCATT GAATTAAATA 1740 

AGCCACGTTC AAATGTAAGT ACTGAATCTT TATATGTTTT AATTGCAATC CATATCAAGA 1800 

CAGCTACCAT TACAATTGAG ATTAAAGAAC TTAAGATGAC CTCATATATT TGAAGCCCTG 1860 

AAGTTTGAGC GCGTACAACT AATTGAAATG GCGCTAAAAA CGGAATATAA CTTGTGATTA 1920 

AAGCAAGTTG TCCATCAGGA TTATTTATCG TGAATATCGC GATATAAAAT GCAATCATAC 1980 

CAAGTAATGT CAGTGGCATC AAAGATTGAT TTAAATCTTC TATTCTAGAT GTTAATGATC 2040 

CGAGGATGGC TGCAAGTAAT ACATACGCCG TAATTCCAAC AATACTACTT ATAATTCCGA 2100

CAATAATAAT TTGCCAAGAC AATTGATTCA TTTCCACX3TT AAAACCTTGT AGCAAGTCTT 2160 

TTAAGTCAAA GGCAAAAATG CATATAACTG CCATCAATAC AATTAAAATA ATCTGAGTCA 2220 
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TAATAATCAT TTCAATGACA CGCGATGTTT TCTCACTAGC AATTTCCATA GCTATTTGAG 2340 

ATGCATAATT TAAAACAATG AAGAACATTA GAAAGATAAT GCCATmaGcT AAAGCATAGT 2400 

TGAAAATCTT TTGTCCTTCT GATACTTTAT CXJACTTCATC ATTAGAAATC ACCTTATTAT 2460 

CAACTTTACT TTGTGCTTGT AATTTTTGTA AGTCTTCTTT GTTGATATTT AATTCCCCGG 2520 

CTACCATATT TGTTTGAATA GCTGTAAGCA GTGCTTGTAC TTTTTGTGAA TCTTCATGAC 2580 

TTACTCGCTT CTCACTAATG ATTGTCCCTT GTAACGTGCG ATTTTGATTC ACCTTGATAA 2640 

TATAAGCTTT ATCAAGTTTA TGTTTTTTTA CTTCTTTTTC AGCATCTTCT ATAGAAACTT 2700 

TAGTAAACTT AGCATCACTA TGAAATGTAT TCX3CCTGTTG CTTGAAAACC TTATAGATTT 2760 

GTTCATTCGG TGCTGCTACA CCAATTTTAT CTGGACCATC ATCAAACATG TTAATAATCT 2820 

TATCAATGTT AGATAGGCCA ATCATTAAGG CAGCAATAAT AATCATAAAA ATTACAAATG 2880 

ATTTAGCTTT AATTTTTTTG ATATATGTCA AAGTAAATGT CGCCCAAAAC TTATGCATCC 2940

TTGCCACCAA CCTTCTCAAT GAATATATCT TGTAATGATG GTTCTACAAC TTGGAATCGT 3000 

TTAACATAAC CTTGATGTGC CACAACTTGA TAAATATCTT TGGCTACGTC TTCATTCTCA 3060 

ATCGTCAACT GAAGACCTTG CTTCATGTTT TCACTATGAA TGATGCCTCT AATGTTTGTT 3120 

AAATCTGGTA GTGTTGTTTC TGATTCAATG ACAACTTTCT TGTTACCATT AGATGCACGT 3180 

ACATGATTGA TATCACCAGA AACAACAAGT TGACCTTTAT CTAAAATACA AACATCATCA 3240 

CATAATTCTT CAACATGCTC CATACGGTGA GAACTATAAA CGATTGTACT GCCCCAATCA 3300 

TTTAAGTCTT TAACTGCTTc TTTTAATAAC TCAACATTAA CTGGGTCTAG ACCACTGAAA 3360 

GGCTCATCTA ATATTAGTAA TTCTGGTTTA TGTAACATAC TTGCTAACAG CTGAATTTTT 3420

TGTTGATTCC CTTTTGATAG ACTATCAATT CGTTTTTTGC GGTTTTCAGT AATATCAAAA 3480 

CGCTCAAGCC AATACGATAT TTGCTGTTGT ATTTCTGTTT TTGACATTCC CTTTAAAGTT 3540 

GCCAAATATT TCAATTCTTC TTCAACTGTC AATTTCCCAT GTAAACCGCG TTCTTCCGGT 3600 

AAATAACCAA TACGATTGTA CATTGTTTTA TCTAGTTTTT TACCGTTATA CGTrrTGTGT 3660 

CCTTCAGTTG GTTCACTTAA GCCTAAAATC ATACGAAATG TCX3TTGTTTT ACmTGCACCA 3720 

TTTCTTCCTA GAAAACCTAA CATTTTACCT GATTCTAACT TTAATGAAAT ATCATTTACT 3780 

GCC6TCATCT TGCCAAAACG TTTCGTAACA TGTTCAATTA CAAGTCCCAT ACTTTGCCTC 3840 

CTAAAAAnAT ATGTATTTAT CTTAATATAA CATTTCCATT CTCTATAAAT GCAATATTTT 3900

TAAAATGAAT TTATTTTTAA AATTTCTGAA ATTGAAAAAT TTAAATAGTG CCATTTTTGC 3960 

ATGTTAAGTA TCATTAGCAC TAGATATGTT TTTTCCATGC CTTTATTGCC TTATTTGTAA 4020 
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CTTnCCGGTG TTT 4093 
(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17846 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

TGCCAAACTA CCTTTTGACA GTCGTTGCTG TACTTCAGGA TGATCAATCA CATATnTTAC 60 

TTTAXCAAAT AGGGCATCTT CATCATTTTT AGTAATTAAA TAACCATTGA AATCTGAAGT 120 

AATCAGTTC6 TTAGGTCCAT ATTTAATATC ATAACTAATA ACTGGAACAC CATGTGCTAA 180 

AGATTCAAGT AGCGCTAAAG AGAAACCTTC CATGTTACTT GTTATTAAAC TCAAATAGGC 240 

ATCGCTATAT TCTTGGTCTA GATTGCTTAA AAAGCCGCGT AAGTAAACAT GATTTTCCAA 300 

TCCATATTTT TGTATCAATT CATTTAATTT TTTACTTTCA GAaCCAAAAC CATACATATG 360 

AaGCTCTATT TTTGGGACAT ACGATACTAA GCGTTTAATT AATTCAATTT GTTGATGTAA 420 

TTGTTTTTCA GGTGAATAAC GAGCAACGGA AATTAATTTA ACACTGCGCT GATCTAATGT 480 

TTGGACTGGT GTATCAATTG TTTCACTATA GCCGACAGGA ATATTAACAA CTGGAATAGT 540 

ATGGTTAATA CGTTTTTCAA CATCTAATTT TTGCTGCTCA GTAGAAACGA TAATTGCACG 600 

ATATCGAGAT AAATTTTCAA ACATCGCTTT ATATACATTT TTAAATGGCG ATGAATCTAA 660 

TGCATCAATA TTTTTAATGT GTGTACTGTG AAGCACAGCT ACTACTGGGA TTGACTCAGG 720

CGTTAAGTTG AAAATAGGTG CTGTGTACAC ATTACOATCA CTGAAAAATA AATCCCCATG 780 

TTGATATAGT TGTTTAATGA AAAATGCGCC TAATTCCGTT TCATTATTAA AGAAATATTG 840 

TTTGTTAGCA TAGTAAACAA TAATTTTTTG TACTTCTGGT TTGCCATCCT TGTAAGAAAA 900 

ATACTTTTCT AATTTTGTGT CACCTTCTGG ATTATAGAAA AATTCACATA ATGTTTGTTG 960 

TTTATCAACA AGAATCCTAC TACAACTTAA AAAGCCACGC ACATCATAAA AATCACGTTT 1020 

TACTTtTCGT CTTTGACTAT CAAAATGATT TACATAATCT AATATAC6AT ATTTAGGATC 1080 

TTGAAAATGG GCATACATTA AGAAACGCTC TTGATCATAT ATTCTAAAGT CATCACTATT 1140 

TTCAACATGT TTTAAAGTAT AATGACATTC ATCAGTCCAA TACGACAACC AGTCAAATGG 1200

TTCATTGCGT TCTAAATATG TTGCTTCTT6 GAAGAAATCA TACATATTAA TATAGTCAGA 1260 

ACTAGTAATA TAATTTTGGG CATTTCTATA TAAATATCTA TTCCATGACA GAAATACACA 1320 
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CCCAGTTAAA TTAACACCTA AACTATTACC TACAAAATAA TTCATTTACA ACACCACTTA 1440 

TATCTATTTT TTATAATTAT ATCACATAAT ATTTAATTAC TTCTTTTAAC TGGAAGATGT 1500 

GTTTATTTAT AAAACAACAA ATTTTGATAT TTATAATGAT AGTAGTTATT CAATCAcTAC 1560 

GACCcAATAT ATCATkGTAG AGCTTAGGAT ATTGATTTAT GACTCAGGCA CATCAAATGa 1620 

GAgGATTTAT AAArGAGATA TACAACTCTA GAAGGTATAA TAAAAACX3CG CAACTAATGT 1680 

TACGCX5TTTG AATTAATCAT ATGATATTAT TTGCGATACT TTAATTTAGC GAAAgcATCA 1740 

TGTTGATGGA TAGACTCTTC ATTACGACAT TCGATATCGA AACCGTCTAA CCAATCAAAT 1800 

TCAACTAAGT CCGCXK3CAAT TAAACGAATT AAGTCTTCGA CAAAACGTGG ATTTTGATAT 1860

GCACGCTCTG TCACACGTTT TTCATCAGGA CGTTTTAAAA TAGGGTATAG AATTGAACTT 1920 

GCATTAGCTT CCATTGCATC TAAAATTTTA TTTTTATAGT CATCAACTAT GTCTTGATCT 1980 

TTATTAATAT ATGTTTTAAC AGTGACAACA CCACGTTGGT TGTGCX3CTGA ATACTCACTT 2040

ATTTCTTTTG AACAAGGGCA TAGCGTTGTG ACAGTTGCTT CAATAGTAAG TTCTTTACGT 2100 

GTAnCTTTAT CACCGTCAAT TGCTAATCCA TAAGTGACAT CGGCATTACC AACTGCTTTA 2160 

ATATTTGTGG TTGGACTATA GCGATCAAAG AACCATTTCC CAGAAACATC AACGCCTGCC 2220 

GCATTTTGTT TCATATTCGT TTGTAAAGTG CX5TAACACCT GATAAAGTGT ATTAAATTCA 2280 

AGTTCAATAC CATTATCATA GTGCTTTTCA ACACTTTCGA TTATACGGCT CATATTAATA 2340 

CCTTTTTCGT CTTTTGTTAA ACTTGTTGAA AAACTAAATG TGCCAGCTGT TTGATACTGG 2400 

TCAACAAGTA CAGGGTACAC TAAGTTTTTA ATACCAACTT CTTCTATTTC AAATAAAAAA 2460 

TCTTTATGTG TACTTTGTAA ATCTGTCATT TCGTTCTTAG TAGTAGGTTT CX5TGCCTTCA 2520 

ATAGGATCTA CGGAACCAAA GTGTTTCCAA CGACCTTCTC GTGTCGATAA ATCAAATTCA 2580 

GTCMTTTTT TCCTCCGTTA AGATTTAAAG TGATATGTCC AATATGGTTC GACTGTTAAA 2640 

AAGCiTGTGTT GTTTACCATC GATTTCAGGA CTTGCTAATT GTTTTAAAAA TGGACCTGTT 2700

TGAGAAGCAT GTGCTTCAAA TGCCTTAATT TTAAGTTCTT TAAAATCTGT AATATCATTT 2760 

TGAATATCAG GTTCTCCAAG AGCTTCGGTT GCATCATTAC TGAACGCAAC TAAAGTTAAA 2820 

CGAGGGCGTT CTTCTTTAGG CATGCX3TTCA ACCGTTCGAA TTACAGCGTC TGCTGTTGCT 2880 

TCGTGATCAG GATGTACTGC ATATCCAGGA TAAAATGAAA TAATCAATGA TGGATTTGTA 2940 

TCATCGATTA AAGATTTAAT CATACC7VTCT ATATGTTCAT AGGGTTCAAA TTCGACAGTT 3000 

TTGTCACGTA AACCCATTTT TCTTAAATCA GTAATACCGA TAACTTTACA AGCTTCTTCT 3060 

AGTTCACGCT CACGAATACT TGGTAATGAT TCGCGTGTT6 CAAATGGGGG ATTACCTAAA 3120 
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TAATTTGCTA ATGTGCCTGC AGATGAGAAG GTTTCATCAT CAGGATGTGG AAATATTACT 3240 

AATACATGTC TTTCGTCAGT CATGTTGATG CCTCCTCTAT AAATTAAATG GTCGCTCACT 3300 

AATTTGAAGT GCTGCAGCGA GTTGACCTTC GTAATTAAAA CCTGCAATTA AAAATTCATC 3360 

ATGCTCATTG ACCTCAAAAT GCGTTAGACC TTGTACATAA ACCCAACCAC CATTTGATAG 3420 

TTTAAGACCA ATGCGATAAG GTTCTTTATT ACCACCTTTT AGTTGTGCAT GCGTATATGT 3480 

TATTTGTATG TTTCTTAAAA AAGTACCAGC ATTAAAAACA CGTTGATCGA AATGGTTCGC 3540 

ATAGGCCCCA TTTGTC6TTT CAACATGCAG ATACACAGGT TTATGTTCAA AAGAAGCAAG 3600 

TAAATCTATA ACTTCTTGTT CTTTAATTGG TTCCAACACG TTCACTCCTT ACACTATCAA 3660

TGTGTTTATC TTTCTATTTT ACTAAAAACT ATTCXaVTAAT TGTATACQAT TGCTCAATTA 3720 

TTTATAAATT AATTTTCATG AAGGGTAATT ACTCAGGATT ACGTAATCAT ACAGCATTAG 3780 

TTTTTTACTT TTAAAAATCA AAAATTTGTT GGAATTTGAA AAGTGTTAAA CATTAAAAAT 3840

GATGCTATAT TAATGGTGTA TGAATGAATT CATAAGTTTT TAAAATGTAT TAAATTTGTG 3900 

GAGGCATGTA AACAATGAAA GTATTAAACT TAGGATCGAA AAAACAAGCA TCATTCTATX5 3960 

TTGCATGTGA GTTATATAAA GAGATGGCAT TTAATCAGCA CTGTAAACTA GGTTTAGCAA 4020 

CTGGTGGTAC AATGACAGAT TTGTATGAAC AACTTGTTAA GTTGTTAAAT AAAAATCAGT 4080 

TAAACX5TAGA CAATGTATCC ACGTTTAATT TAGACGAATA TGTAGGTTTA ACCGCATCAC 4140 

ATCCGCAAAG TTATCACTAT TATATGGATG ACATGCTTTT CAAACAATAT CCTTATTTTA 4200 

ATAGAAAGAA CATTCATATT CCAAATGGAG ATGCCGATGA TATGAATGCG GAAGCGTgCA 4260 

AAATATAATG ACGTTTTAGA ACAACAAGGT CAACGTGATA TTCAAATTTT AGGTATTGGT 4320

GAAAATGGTC ATATTGGATT TAATGAACCT GGTACGCCGT TTGATAGCGT TACTCATATC 4380 

GTTGffTTTGA CTGAAaGTAC TATTAAGGCT AATAGTCGAT ATTTTAAAAA CGAaGATGAT 4440 

GTTCCAAAGC AAGCCATTTC GATGGGACTT GCTAATATTC TTCAAGCCAA ACGTATCATT 4500

TTACTCGCAT TTGGTGAAAA GAAACGTGCT GCTATTACAC ATTTATTAAA TCAGGAAATT 4560 

TCTGTTGATG TTCCAGCCAC ATTACTTCAC AAACACCCGA ATGTTGAGAT ATATTTAGAC 4620 

GACX3AAGCTT GCCCGAAAAA TGTTGCGAAA ATTCATGTCG ATGAAATGGA TTGATTGCAA 4680 

TGTTTAATTA AGAAAT6CCT CGGGAAAGGT TCCAATAGAA AGATAAAAAG CATTGGAAGG 4740 

ATGATTTTTA GTGGAATTAC AATTAGCAAT TGATTTATTA AACAAAGAAG ACGCGGCTGA 4800

GTTAGCAAAT AAAGTAAAAG ATTATGTAGA TATCGTAGAA ATCGGTACGC CAATCATTTA 4860

CAACGaAGGT TTACCAGCAG TTAAACATAT OOCAGACAAC ATTAGTAATG TAAAAGTATT 4920 
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CGCGGATGTA ATTACAATAC TAGGTGTTGC AGAAGATGCA TCAATTAAAG CAGCTATTGA 504 0 

AGAAGCTCAT AAAAATAATA AACAATTACT AGTTGATATG ATTGCTGTTC AAGATTTAGA 5100 
AAAACGTGCA AAAGAACTAG ATGAAATGGG TGCTGATTAT ATTGCAGTAC ACACTGGTTA 5160 

TGATTTACAA GCAGAAGGGC AATCACCATT AGAAAGTTTA AGAACCGTTA AATCTGTTAT 5220 

TAAAAATTCT AAAGTTGCAG TAGCAGGTGG AATTAAACCA GATACAATTA AAGATATTGT 5280 

CGCTGAAAGT CCTGATCTTG TTATTGTTGG TGGCGGAATC GCAAATGCA6 ATGATCCAGT 5340 

AGAAGCTGCG AAACAATGTC GCGCTGCAAT CGAAGGTAAG TAATATGOCT AAATTTAGTG 5400 

ACTATCAATT AATTCTAGAT GAATTAAAGA TGACTTTGTC ACATGTTGAA GCGGATGAGT 5460 

TTTCAACTTT TGCATCCAAA ATACTACATG CTGAACATAT ATTTGTAGCT GGCAAAGGAC 5520 

GTTCAGGATT CX3TGGCGAAT AGTTTTGCAA TGCGCTTAAA TCAGCTCGGC AAACAGGCAC 5580

ATGTTGTTGG AGAATCAACG ACACCTGCGA TTAAGTCGAA TGATCTATTT GTAATTATCT 5640

CTGGTTCAGG TTCCACGGAA CATTTAAGAT TATTAGCAGA CAAAGCAAAA TCAGTAGGTG 5700 

CTGACATCGT ATTAATTACT ACAAATAAAG ATTCTGCAAT AGGCAATCTA GCTGGGACGA 5760 

ACATCGTTTT GCCTGCAGGT ACAAAATATG ATGAACAAGG CTCGGCACAA CCATTAGGAA 5820 

GTTTGTTTGA ACAAGCATCT CAATTATTTT TAGATAGTGT TGTAATGGGA TTGATGACTG 5880 

AAATGAATGT TACGGAACAA ACGATGCAAC AAAATCATGC TAATTTAGAA TAAAATAAAG 5940 

ATAGTCGATA ATATGATGCC TAGGCAGAAA TATTATCQAT TATTTTTTTA TTTAAATAAT 6000 

AAATTATAGT ATAATATCAA TAATAAACGA ATAGGGGTGT TAATATTGAA GTTTGACAAT 6060 

TATATTTTTG AnTTGATGG TACGTTGGCA GACACGAAAA AATGTGGTGA AGTAGCAACA 6120

CAAAGTGCAT TTAAAGCATG TGGCTTAACG GAACCATCAT CTAAAGAAAT AACGCATTAT 6180 

ATGGSAATAC CTATTGAAGA ATCATTTTTA AAATTAGCAO ACCGACCATT AGATGAAGCA 6240 

GCATTAGCAA AGTTAATCGA TACATTTAGA CATACATATC AATCTATTGA AAAGGACTAT 6300

ATTTATGAAT TTGCGGGTAT AACTGAAGCC ATTACAAGTT TGTATAACCA AGGGAAAAAA 6360 

CTTTTCGTGG TGTCTAGTAA GAAGAGTGAT GTATTAGAAA GAAATTTATC GGCTATTGGA 6420 

TTAAATCACT TGATTACCGA AGCTGTTGGA TCCX3ATCAAG TAAGTGCATA TAAACCAAAT 6480 

CCTGAAGGCA TACACACAAT TGTGCAACGC TACAATTTAA ATAGCCAACA AACGGTGTAT 6540 

ATTGGTGATT CAACGTTTGA TGTTGAGATG GCACAACGTG CTGGTATGCA ATCTGCA6CT 6600 

GTCACTTGGG GTGCACATGA TGCAAGGTCA TTACTTCATT CAAATCCGGA TTTTATTATT 6660 

AATGATCCAT CAGAAATTAA TACCGTATTA TAAAACTTGT TAAAACAGAG AATACCATGG 6720 
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ATTTAAAATA AATATTTATT AAACATTATG AATTTTTAAA GAGTAATGTC TGACTCGTTG 6840 

ATAATTTATT TTTGTAAAAA TAAATTAAAG TAATGACAAA GTTATTGAAG TAAATTGAGT 6900

ATAAACATTT AAATACGATG TCGAAAATGG CGATAGCATA TCACTTACAT GAAGTTGTGT 6960 

GctATCGCTA TTTTTAGTTA TAATTCCAAA AAGTTAATCG TTCGATGATT TAAGAATTAT 7020 

TATTGTTTAA TTCAAATGTA TGAGGGTATA AAATCATTGA ATTTAATTCG ATAAAGCGAA 7080 

ATTTTTGAAC AAACATACTT TTGTATTTAT ATAAAAGTTT AAATTCTTAT AAATTTGACA 7140 

AAACTAATTA ACTCCGTATA ATTATGAAAC ATACAAGAGG GAGT6TATGA ATTCATGGAT 7200 

TTTAATAAAG AGAATATTAA CATGGTGGAT GCAAAGAAAG CTAAAAAAAC CGTTGTTGCA 7260 

ACCGGTATCG GTAATGCAAT GGAATGGTTC GATTTTGGTG TCTATGCATA TAcAACTGCG 7320 

TACATTGGAG CGAACTTCTT CTCTCCAGTA GAGAAT6CAG ACATTCGACA AATGTTGACT 7380 

TTCGCAGCAT TAGCCATTGC GTTTTTATTA AGACCAATTG GTGGTGTCGT ATTTGGTATT 7440

ATTGGTGACA AATATGGACG TAAAGTTGTA TTAACATCTA CAATTATTTT AATGGCATTT 7500 

TCAACATTAA CCATTGGATT ATTGCCAAGC TATGATCAAA TTGGACTTTG GGCACCAATA 7560 

CTATTATTGC TTGCAAGAGT ACTACAAGGG TTTTCAACAG GTGGAGAGTA TGCGGGGGCA 7620 

ATGACATATG TTGCCGAATC ATCTCCAGAT AAGCGTCGTA ACTCATTAGG TAGTGGACTA 7680 

GAAATTGGGA CATTATCAGG TTACATAGCT GCTTCAATTA TGATTGCTGT ATTAACATTC 7740 

TTTTTAACAG ATGAACAAAT GGCATCATTT GGTTGGAGAA TCCCATTCTT ACTCGGTTTA 7800 

TTCCTAGGAT TATTCGGCTT ATATTTACGT CGTAAGCTGG AAGAATCACC AGTTTTCGAA 7860 

AATGATGTTG CAACACAACC AGAAAGAGAT AACATTAACT TTTTACAAAT CATCAGATTT 7920

TATTACAAAG ATATATTTGT ATGTTTTGTA GCTGTTGTAT TCTTCaATGT TACAAACTAT 7980 

ATGGTAACTG CATATTTACC AACCTATTTA GAACAAGTTA TTAAATTAGA TGCAACGACA 8040 

ACAAGTGTAT TAATTACTTG TGTCATGGCA ATAATGATTC CATTAGCATT AATGTTTGGT 8100 

AAGTTAGCGG ATAAAATAGG TGAAAAGAAA GTATTTCTAA TTGGTACTGG TGGGCTAACA 8160 

TTATTCAGTA TCATCGCATT TATGTTATTA CATTCACAAT CATTTGTTGT AATAGTAATC 8220 

GGTATATTTA TATTAGGATT TTTCTTATCA ACTTACQAAG CGACAATGCC AGGGTCGTTA 8280 

CCAACGATGT TTTACAGTCA TATAAGATAT CGAACTTTAT CAGTAACATT TAATATCTCT 8340 

GTTTCGATAT TTGGTGGTaC GaCGCCATTA GTkGCAmCaT GGTTaGTTAC GAAAACTGGA 8400 

GATCCATTAG CmCCTGCGTA TTATTTAACA GCAATCAGTG TTATTGGCTT TTTAGTTATT 8460 

ACATTCTTAC ATTTAAGTAC AGCAGGAAAA TCTCTAAAAG GTTCGTATCC AAATGTAGAT 8520 



EP 0 786 519 A2 



GAACGTAAGA ATTAGAGATT 
AGCTAGTAGG TTCTGCTAAC 
ATAAAGTTTT TGTATATACA 
GGGACTTAAA GCATATGTTT 
GGATGTAAAT ATGTCTTAGA 
CAATATTATT ATAGAGAACA 
TGCGATTGCG ATAACTTCTT 
CATGGTACTA CAGTATCAAA 
TTACCTGATA AAAATACTTA 
ATTGAGTGAG GGATATTGAT 
AAACCTAATG ACATAGCATT 
GCCCGCATCA CTAGCGCAgT 
ATTAATATGA AATCACCGGT 
AAAGTGCCTA TGATGATGGA 
AAATATGGTA TTAAAGATGT 
ATGTTTATTG ATTCAACGCA 
TCAGGGACAA CTGGACTGCC 
TTTGAAGTTA ATGAAATGTT 
CTATCGCACT CGTTAACATT 
ATAGGACAGA CCACTTTTCA 
TACAAAGTTG CTATGTTTCT 
AATGAACATA CAATCCAATC 
AAAAAGATAA AAAATCAAGC 
ACCAGTTTTA TCAGCTATAA 
TTTCCAAATG TGGAATTGAA 
ATAAAAAGTA ATATGATGTT 
TGGTTTGTTA CTAATGATAA 
CAACAGGATA TGTTAATTAT 
TTAACGCAAT CTTCGAGCAT 



TTAATaAAAA GTATAAATCA 
TTTAAAGTGC TTTTTAAATT 
TAAACCCCCA CTGCAATGAT 
AGCTTTGAAT ACTTAAAATT 
GTATTTTGTC CAACGCAATT 



CAAACTTAAA TAGATTGGGT 
TTCTCTATAT ACATATAGTA 
TTTAtCTAGG GCTTAAGTTT 
TTCATTATAT AATGTTAACA 
GAACGTAATT TTAGAACAGT 
ACATATCGAT GATGAAACAA 
TGAATCTTTG CAGAAATATT 
GCAAAGTATT ATTTGTTATT 
AGGTAAATGG CAAAGTACTA 
AATTGGAGAT ACAGGTCTCA 
ATTACAGCAC TACCCCAATT 
AAAAGCATAT TATCGTGATG 
GATGTTAAAA AATGAAAATG 
ATATGCGTTA TTGTTTGCTT 
TCCTGAAAAG TTACTTAATC 
TGTTCCAACG ATGATTAAAT 
ATTTTTTAGC AGTGGAGATA 
AAATGACATA AATTTGATTG 
CTTGAATCAG CAAGCACCAG 
AACAACGAAT CACX3ATCACA 
TAGTGGCTAT GTAAGTGAAC 
TGGCTATGTA AAAGAGCAGT 
TGGTGGTCAA AATATATATC 
TGATGAAGCA ATTATCATCG 



ATCGTATATA 
GAGAACTGTA 
TATCGCAATG 
CTCTTGCTAT 
AATATTGAGA 
GACTTATTTG 
ACGTCTTATC 
GATTTTTATA 
ATATGTATTT 
T6AAAACACA 
TTACATATAG 
CACTTAACCC 
TAGCTTTGCA 
TACATCGTCA 
TGCAGAATAT 
TATTACATAT 
AAGATTCATG 
CAATAGCAGC 
TAAGTTCCGG 
AATGTCATAA 
CATTATTGTT 
AGCTGCATTC 
AATTTTTTGG 
TTGAATCAGT 
ATGGTATAGG 
AATGTATAAA 
ATTTATATTT 
CAGCACATGT 
GTATTCCAAA 



AGCACTTTAA 
ATTAGCCGTA 
GGGGAAAGAG 
TGAAATGTTA 
CTCTAACCTT 
TGTCAGTTAT 
TAATAAAAAA 
ATAGGCAGGT 
TAAAGTTTAC 
TACTCAAAAT 
TCAACTAAAT 
TGTCGTTGCT 
TCGTTTACAT 
ATTGATTGAA 
AGACTCACCG 
TGGTTTTACT 
GTTGGCTTCT 
CCCTGGACCA 
TCGTACTTTT 
AATATCATCA 
AGTTTACAAC 
TTCTATTTTT 
TACATCGGAA 
AGGTGTGCTA 
AACTATTTGT 
TAATGATGAA 
AACGGGACGT 
TGAACGCCTT 
TGAGCX5TTTT 



8640 
8700 
8760 
8820
8880 
8940 
9000 
9060 
9120
9180 
9240 
9300 
9360 
9420 
9480 
9540 
9600 
9660 
9720 
9780 
9840 
9900 
9960 
10020 
10080 
10140 
10200 
10260 
10320 
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CAATTTTTAA AAAAGAAAGT 
AAGATGTATT ACACTGCAAG 
AGAGGTGAAT TATAATATGA 
GAAATATGGT GGCACTTTAA 
ACATTTTAAA GAGAAGTATC 
TGTTGTTGGG AATGGTGGCA 
TTCAATACCT GGCGTCACAA 
TGCATGTCGC ATGATCCAAG 
TACAAGTCGA GCACCTTGGA 
TGAGTTTTAT GAGCGTGCAT 
T6CTGAAAAT GTGGCCAAGA 
TCGAAGTCAT CAATTGACAG 
ACCTATAACC GTTAAAGGAG 
GAAAGATAAC TTTGGCCGAT 
TAGTTGTATG AAAAATGATG 
CGAATTAGGT TTCGAGCATG 
TAATTTTCCT GGCATTGGTC 
AACGATAGAA AATATTGAAG 
CT6CCAACAA GCTTTAAATA 
ATCAGGTCAT CCATACX^GTG 
TGAC^AAGAG ACTATGATTG 
ATTTACTCGA TTCTAACCAG 
ATTATCAGTG TTTTAACCAA 
TATGGCCAAA ATTTGATTTA 
CAGCAATTAA TCAAACTCAA 
ATATTGAATG CCGACAGACT 



AAAATGATCA ACATGTCATA 
GGAGTTTAAT GAGATATGGA 
ACATAATGAG ATTGTGCCAG 



GAAaCgnTaT 
tGGTaAAATT 
ATCAAGCAGT 
AACATTTAGA 
CAGAGGTAAT 
ATATTGCAAG 
TCGATCGGCA 
CCGGAGCTGG 
AAATCAAACG 
CATTTGCACC 
TGTATGATGT 
CGGAAAATGT 
AAATATTCAA 
TTAAGCCCGT 
GTGCAGTTTT 
GTTTATTATT 
CAGTACCAGC 
TCATTGAAAT 
TTTCAAATAC 
CAAGCGOTGC 
CATCTATGGG 
CGATTAAATG 
CCTTATAGAA 
TTTAAAAAAT 
AAGATAGAAG 
CGCAATATCA 
ACGGTtACAC 
TAAATGAATA 
GACAATTAGT 



GAAATTCCAT 
6CTAGAGAAA 
CATAGTTGCA 
GCCaGAACAA 
ATCTAAAATA 
AAAAGCATTG 
ATGTGGGTCT 
CAAGGTATAT 
ACCX3CATTCT 
TGAAATGAGC 
TTCAAGAGAA 
AAAGAAT6GA 
CACTGATGAA 
GATCAAAGGT 
ATTGCTTATT 
TAAAGATGGT 
CATTTCCAAC 
TAACGAAGCG 
GCAATTAAAT 
CCAATTAGTG 
6ATAGGGGGA 
TGTCATTTTC 
AAGAAGTACC 
ATGCAAATAG 
TAGACACAAT 
CACGTTATAC 
AAACTTTTAT 
TTTGGCGCTC 
GAGTCAAATG 



CGATGATTCA 
AAAT6ATGTC 
GCTAAACGAA 
TTGCTTAAAC 
GATGATGTAG 
CTTGAAGCGG 
GGACTTGAAA 
ATTGCAGGTG 
GTGTACGAAA 
GACCCATCAA 
TTACAAGATG 
AATATTTCTC 
AGTCTAAAAT 
GGGACCGTTA 
ATGGAAAAAG 
GTTACGGTAG 
TTACTAAAAA 
TTCAGTGCAC 
ATATGGGGTG 
ACTCGATTAT 
GGTCTAOGAA 
TAAGGATAGT 
ACCATTAATG 
CGAACTGATT 
ATATGTAGGG 
AATGGCTTTA 
TAAGGCGATG 
GTAAATGATG 
ATGCTGATGG 



TCATGTAGAA 
GATGTATTTG 
CTGCATTTGG 
CTTTATTCCA 
TTTTAGGTAA 
GGCTTAAAGA 
GTGTTCAATA 
GTGTTGAAAG 
CAGCATTACC 
TGATTCAAGG 
AATTTGCTTA 
AGGAAATATT 
CACATATTCC 
CCGCTGCGAA 
ATATGGCATA 
GTGTTGATTC 
GAAATC7VATT 
AGGTAGTTGC 
GTGCATTAGC 
TTTATATGTT 
ATGCAGCATT 
GTGGCTGCAT 
TGTGCGTCAT 
TTAACAAAAT 
CATTTAGAAG 
ACATTAACTA 
AAGTAGAGAT 
ATAATCCAAT 
CTATGTCATT 



10440 

10500 

10560 

10620 

10680 

10740 

10800 

10860 

10920 

10980

11040 

11100 

11160 

11220 

11280 

11340 

11400 

11460 

11520 

11580 

11640 

11700 

11760 

11820 

11880 

11940 

12000 

12060 

12120 
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ATTCATTGAA CAACACGAAC ACGAAATTAT 
AAAAATTTCT TTGAGCACAA AAAAATAACC 

GGAGATGAAA GGACAGCTAA TATCAGTTAT 
AATATAGGTT ACGTTTCTTT CTTTGCACGG 
CTATATCAAT GTTTAATAAA TTCTGGATTA 

CATATGATCT ATATCGTCTT GTAATAAAGA 
TGAATCGTCA CATTTAATTG AAACATGCTG 

TGCGCCTTCA TGGTGATACT GTCGATAAAT

TCTATGGTTA TATTATAAAT AACATTTTTA 
TTATCAGACA TAGAACGTAT GATTTACTAA 

TATATATTTA TAGAGTCGCC TGGCAGTCAT

GCATCTATCG CAAAAGAATG ATAATGATAG 
CATCTTGAAA ATAAAGGGTT ATTTAGTCAT 
TATTGTTCGA TATGTATGAA ATTTTCAATA 
AATTTAAATT ATATACAGAG CATGATGATT 
GTTCATACCC AATTTAAGTG GTGTGGCTAA 

CGTTAAACCT CTGTTACTTC AACATCGATA 
ACAGGACCAA CAAAATCATT CATTTTCCAA 
TTCGCGCTAA TCACAGCTTC TTTCGGTGAC 

GCAGCAAATG TACAACCAGC ACCATGGTTA 
TGATAAAATG TTTGACCATC ATAGTATAAG 

CCACCTTTAA TGATGACATG CTGTGOGCCT

ATATCTTCAA TTGAATTTAA TTTACCTAAT 
GGTGTCACTA CCGTTGCTTT AGGTAGTAAA

TTAAGCACTT CATCTTCGCC TTTACAAACC

TTAGATGCCT CATATACTTC TCCAGCACGT 
GTTTTAATAG CATCAGGTCC GATTGATAAA 

ATTGGTAATG GTGTAACATC GTGTGACCAT 
AAAGCGACCA TGCCATACGT ATCTAATTCT 



AGCAATTAAT GACGATGGAG AGATTAAAAT 12240 

GATATTAGCT GCATGAACGC ATATTAATTA 12300 

GTATTGTTAT TATTATTGGG AACAGAGATG 12360 

GGATGCATTA ATCTAAAATA ATAATAACAA 12420 

TTGGAACGAT TAGTCAATTT AACTAACTTT 12480

GAGCAATTTG AATATTTCAG TATCACTAAA 12540 

AAACGTTTTG GTTATAATTT CATAAACTGG 12600

AATCATAACC TATATTACCT CCTTTGCTAC 12660 

TGTGTGACAT CAACCTTAA6 TATCAACTTT 12720 

GACTATTTAT GTATAAAAGT TCTAAATAAA 12780 

TTGGGaAATA TAACATATAT GATTAGAGAG 12840 

AGGTATTGAG CATATAGATG AGTTTAAGTT 12900 

AGATGTAGAT GTATAGGAAA TATTTGTATG 12960 

AAAGCTAATA ACGCTTATAT GTAACTTTCA 13020 

ATAAAAAAAT AACCACATCA CATAAATTGA 13080 

TAATGTTGAT TTATAGATGA ACCGCCTAAT 13140

TGTTCAATAC GGTTGTATGC ACCGTGATCC 13200 

CCGTTTTTAA TAGCAGAAGC GACGAAAGCT 13260 

TTACCGTTAG CTAAATATGC AGTTGTTGCC 13320 

TAACTTTGTT GGAACATGTC TGTTGTTAGT 13380 

TCATACGATT TATCTTGATC TAAAGCTTTG 13440 

TTATCAAAGA TAATTGTTGC AGCCTTTTTC 13500 

CCTGATAATT GACCCGCTTC AAATAAGTTT 13560 

TATTTAATCA TCGCCTCAGT ATTTCCAGGA 13620 

ATGACAGGAT CTACTACAAA ATATTGTGCA 13680 

TTGATTATCT CCTCAGTACC TAACATACCT 13740 

GCCGTTTCAA GTTGTTTTTC AAATACATCC 13800 

GTATCTTTAT CCATAGTAAC GATGGCAGTT 13860 

TGGAACGTTT TCAAATCTGC TTGCATACcT 13920 
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CACTCCTACA TAATAATATT GTATTCATCA 
AGCATTCAAT ATTTGATGAT TGTTCAAATG 
TGTCATTCAC TTTAGATAAG TGTGATATGT

AATGGTCGCA AATTTTTCAT GACATAACAA 
TTTTAGAAAA AGAATATTCG ACTGCAATCG 

CGTTTGATTT AACACXX3TTT GAAAATATCA 
ATQGTCCAAA CCAAGCACAT GGATTAGCAT 
CATCTTTACG TAATATGTAT AAAGAATTAG 
CGCATTTACA AGATTGGGCA AGAGAAGGCG 
GACAGGGTGA AGCAAATTCT CATCGTGATA 

TTAAAGCAGT GTCTGATTAT AAAGAACATG

AGCAAAAAAT AAAGCTTATC GATACATCTA 
GTCCACTGTC TGCATATAGA GGATTCTTTG 

ATTTAGAGTC AGTAGGAAAA TCACCAATTA

ATAGAGAAAC TTTAATAGCA CGAATTGAGC 
ATGACCATGA CTTTGAAAAA CATATGTATG 

CAACATCAAA TACACCACAT ATTGGTGAAC 
ATCAAATGCC ACAATCACAA ATAACGCAGC 
AAGCGATGGG TGGTAAAGTA AATAOGCATT 

AACCTTCAAA CCAACAACAA AGATTAGCGA 
TATttGATTT TTAAAAAGCA ACAATGAAAC 
GGTTAATAAT CAAQACGCAT ATACTTTTAT

ACTGAATTAT ATAAGGAGAG GTAGCAATGA 
CGATGATGGC TGTCGGTACA GGTGCATTTG 
ATCACTATTT ATCAGTATGG GAAAAAOCAA 
TATTAATTAT AGGTGTAATT AGTGGTACAA 
TAATATTTGC TGGTATTATT TTCTTTAGTG 

TTAAAGTTTT AGGTGCGATT ACGCCAATTG 
TGTTAATCAT TGCGACATTC AAATTTGCTG 



TATCATTTTT AACCTAATTG AAAAATATTA 14040

AATCATTCAT ACTATTGTAA CTTTTGAAAA 14100 

TAAAATATGT CCTGAGGTGA GATTGAATGG 14160 

CX3AAACATGA CTTTAAAGCT ATGCATGATT 14220 

TATACCCTGA TAGGGAAAAT ATATATCAAG 14280 

AAGTTGTTAT ATTAGGACAA GACCCGTATC 14340 

TTTCAGTGCA ACCTAACGCA AAATTCCCTC 14400 

CAGATGATAT TGGATGCGTT AGACAAACAC 14460 

TCTTGTTATT GAATACAGTT TTAACCGTAA 14520 

TTGGTTGGGA AACATTTACT GATGAAATTA 14580 

TTGTCTTTAT TTTGTGGGGG AAACCTGCAC 14640 

AACATTGTAT TATAAAATCA GTGCATCCTA 14700 

GATCAAAACC GTATTCCAAA GCGAATGCCT 14760 

ATTGGTGTGA AAGTGAGGCG TAGATGTTGA 14820 

AAGAATTAGT ACAAGCAGAG CAGGCACAGC 14880

CCATACATAT ATTAACATCT TTATATGCTT 14940

AACAAATGAA TCGTCGTATT GCTAACCATA 15000 

CAACTCATCA AGTGACAGTT GCTGAAATTG 15060 

CAGCACATCA TCATAATAAG TCATATTCAC 15120 

CAGATGATGA CATTGGCAAT GGTGAATCCA 15180 

ATAATTACTT AATAGCTTGT TAAGTATGTA 15240 

TC6AGTGTTC GGATTTAAAC ATTTATTAAT 15300 

AATTATTTAT TATTTTAGGT GCATTAAACG 15360 

GTGGGCATGG TTTACAAGGA AAAATAAGTG 15420 

CGACXTTATCA AATGTACCAT GGCTTAGCAT 15480 

CTTCAATCAA TGTTAACTGG GCTGGCTGGT 15540 

GATCATTATA TATTTTAGTA TTAACTCAAA 15600 

GTGGCGTATT GTTCATCATT GGATGGATAA 15660 

GTTAAATTTT AAAACTTTAG ATTACCTATG 15720 
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TGGGTATAGA ATACCTTCGA GGTGAGTTTT 
ATAGAGGCGA TTTAAAACAA AACCTATCTG 

CATGTATCGG ATGGGGCGCA TTCATCTTAC 
TTGCAGCATC AATTGGTATA GTTATTGGTG 
ATGGCGCATT AGTAGAGAGA TTTCCAGTAT 

GTTTCGGCAG ATATGTGAGT TTCTTCTCAT 
TCGTTGCTTT AAAtGCGACC GCATTCAGTT 
TAAATAATG6 GAAACTATAC ACCATTGCGG 
TTGCGACCGT ATTACTACTT GTATTCATGC 
GATCATTACA ATATTATTTC TGTGTGGCGA 

GTTCATTCTT TGGTAATAAT TTTGCACTTG

AAGGATGGTT AGTGTCTATT GTGGTTATTG 
TTGATAATAT TCCACAAACA GCAGAAGAGT 

TTATCGTGTA CAGTTTATTA GCAGCATCAT

GTTGGTTATC AACAAGTCAT CAAAGTTTAA 
CACAAACAGC ATTTGGTTAT ATTGGATTAG 

TATTTACTGG TTTAAATGGA TTCTTGATGA 
GTTCAGGTAT TATGCCAACA ATGTTTAGTA 
TCGCAATCAT ATTCCTAGTA GGAGTGTCGT 

TGACTTGGAT TGTAGATATG TCATCTACTG 
TGTCSx;CAGC GAAATTATTC AGTTATAACA 
AAACGTTTGC TATTATCGOC TCATTTGTAT

CAGGTTCTCC TGCAGCACTG ACTGCACCGT 
TCGGTTTAAT ATTCTTTGTG ATTCGATATC 
TAAGTCGCTT GATTTTAAAT AGAAGTGAAA 
AAAAAGAAAA AACTAAATAA TAAAAGAATC 
ATCGTGCGAT TTTTTGTATT ATAAATTGAC 

TAATTGCTAA GAGTTAGGGC TGAGCCATTT 
TTCACX3AACC CAGAAACAAT TAATTTGGAA 



TATTTATGGA AAAAAAGAAT AAGCAAATAG 15840 

AAAAGTTTGT ATGGGCGATT GCATATGGTT 15900 

CAGGAGACTG GATTAAGCAG TCAGGTCCGA 15960 

CATTATTAAT GATATTAATT GCGGTTAGTT 16020 

CAGGGGGCGC GTTTGCCTTT AGTTTCTTAA 16080 

CATGGTTTTT AACTTTTGGT TATGTCTGTG 16140 

TACTAGTTAA ATTCTTATT6 CCAGATGTCT 16200 

GCTGGGACGT TTATATTACG GAAATCATTA 16260 

TAGTAACGAT TCGTGGCGCA AGTGTATCTG 16320 

TGGTAATCGT CGTATTATTG ATGTTCTTTG 16380 

AAAATTTACA ACCGTTAGCT GAACCTAGCA 16440 

TATCCGTOGC ACCATGGGCA TATGTTGGAT 16500 

TTAACTTTGC ACCAAACAAG ACATTTAAGC 16560 

TAACTTATGT TGTCATGATT TTATACACTG 16620 

ATGGGCAGTT GTGGTTAACA GGTGCTGtTA 16680 

GTGTATTAGC AATTGCAATT ATGATGGGTA 16740 

GTTCAAGTCG CTTGTTATTT TCTATGGGAC 16800 

AATTACATAG TAAATACAAA ACACCATATG 16860 

TAATTGCACC TTGGCTAGGA AGAACTGCAT 16920 

GTGTATCCAT TGCCTACTTT ATTACATGTT 16980 

AACAAAGTAA TACGTATGCA CCGGTTTACA 17040 

CATTCATTTT CTTAGCGTTG TTATTAGTGC 17100 

CTTATATTGC ATTACTTGGA TGGTTAATCA 17160 

CTAAATIGAA AAATATGGAT AATGATGAAT 17220 

ATGAAGTTGA TGATATGATT GAAGAACCTG 17280 

GCACAATAAA CCTTCTTCAT TCGGAGGCGT 17340

ATTTAAGACG AGGCAGCTGA ACCTTATATA 17400 

CTAACAAATA TTTATAATCG TTTAAAAGAT 17460

ATTTGGTCGG CGAATAATAA ACCTAATGCG 17520 
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AAGACTAAAT TTTTTGTAGC ATCGTATGCT AAGCCACCAG GTACTAATGG AATGATACCC 17640 

GTTACCATAA AAATGATGGC AGGTTCTTTT TGTTTACGAG CCATATAATG ACTTAACAAG 17700 

CCTAATGCTA AACTACCAAA GAAACTAGAG TATATAGTGT GCACATTAAA GCCGTTGAAG 17760 

AATAAGGTGT AAACCATCCA TCCACACGTA CCAACGAAAC CACATGATAG ATATAATTTT 17820 

CTAGGTGCAT CAAAAATGAC GCAGAA 17846



(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5544 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:



ATTGACACTT GGTGAAAGTA ATATCGCCGC GCTATTTTGG CAAAATGGAC ACTTAGAACC 60

TGAGTTACAA GATGAACAGC CAATTAATAT ATTAGGATCT GkTCAAATCA ACGAATGGAA 120

TGGTAATCAA TCACCGCAAA TAATTATTCA AGATATTGCG ATGAATGAAC AGCAAATATT 180

AGATTATAGA AGTAAGCGAA AAAGTTTACC TTTTACAGAA AATGATGAAA ATATTGTCGT 240

GCTTATTCAT CCTAAAAGTG ATAAAGTAAA TGCGAATGAA TATTATTATG GTGAAGAAAT 300

TAAACAACAA ACTGATAAAG TAGTATTAAG AGATTTACCA ACGTCAATGG AAGACTTGTC 360

TAATTCCTTG CAACAACTGC AATTTTCTCA ACTTTATATA GTTTTGCAAC ATAATCATTC 420

GATTTACTTC GATGGTATAC CTAATATGGA TATTTTTAAA AAGTGTTATA AAGCATTAAT 480

AACTAAACAA GAAACAAATA TCCAGAAAGA GGGTATGTTA TTGT6TCAAC ATTTAAGTGT 540

GAAAiCCAGAT ACACTTAAAT TCATGTTGAA A6TTTTCTTA GACTTAAAAT TTGTAACACA 600

AGAAGATGGT TTAATTCGAA TCAATCAACA ACCTGATAAA AGATCGAITG ATTCCAGCAA 660

AGTATATCAA TTAAGACAAC AACGTATGGA TGTTGAAAAG CAATTATTAT ATCAAGATTT 720

TTCAGAAATA AAAAATTGGA TAAAGTCACA ATTGTCGTGA GCAATTTAGG AGGAAATATT 780

AATGGATTTA AAGCAATACG TATCAGAAGT TCAAGATTGG CCGAAACCAG GTGTTAGTTT 840

CAAGGATATT ACTACAATTA TGGATAATGG TGAAGCATAT GGCTATGCAA CAGATAAAAT 900

TGTAGAATAC GCAAT^GACA GAGATGTTGA TATCGTTGTA GGACCTGAAG CGCGTGGCTT 960

TATCATTGGC TGTCCTGTAG CTTATTCAAT GGGGATTGGC TTTGCACCTG TTAGAAAAGA 1020

AGGGAAATTA CCTCGTGmAG TCATTCGTTA TGAGTATGAC CTAGAATATG GTACAAATGT 1080
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ATTAGCTACT GGTGGTACGA TTGAAGCAGC 
CGTAGTAGGT ATTGCATTTA TAATTGAATT 
AGATTACGAT GTTATGAGTT TAATCTCATA

AATGAAATCC TTCATCAAAT GTATAAGAAC 
TTTCTTAACA TGAGATGTTA GGATTTTTTA 

ATACCTTAAT AACATC6TTT ATTTATTTCA 
AAAAATGAAA CAGTAGATTT AGGTCGAATT 
TACAAATTAA ACTCGCTCAA GTAAAATTAA 
TTATCX3TCGA CGGACGTATG ATTGGTGTGG 
TCATTGTTTA AGGCGAAGTA ATAAATATGA 
ATCCATATAG TGCAGACGAA tTCTTCACAA

TGAGTATGTT TTAAAAAGCT ATCATATTGC 
AAACGGATTA CCATACATTA TGCATCCTAT 
ATTAGACGGA CCGACGATTG TCGCAGGTTT 
TACATTTGAA GATGTAAAAG AAATGTTCAA 
GACGAAGCTT AAAAAAGTAA AATACCGCTC 

CAAGTTATTT ATTGCGATTG CCAAAGATGT 
ATTACATAAT ATGCX3TACCT TGAAAGCCAT 
AGAAACATTA GAAATTTATG CACCATTAGC 

GGAACTAGAA GATACGGCTC TTCGTTATAT 
TTTA&TGAAG AAGAAACGTA GTGaACGTGA 

ACGTACTGAA ATGGACCGAA TGAATATCGA

TTACAGTATT TATCGGAAAA TGATGAAGCA 
GTTGGCGATA CGTGTTATTG TCAATTCTAT 

GCATACGTTA TGGAAACCGA TGCCAGGACG

AAATTTGTAT CAGTCATTGC ATACTACAGT 
CCAAATACGA ACGTTTGATA TGCACGAAAT 

TTACAAAGAA GGTAAAAAAG TAAGTGAAAA 
GTTAAAAGAA TTAGCTGAAG CGGATCATAC 



AATAAAATTA GTTGAAAAAT TAGGCGGTAT 1200 

GAAATATTTA AATGGTATTG AAAAAATTAA 1260 

CGACGAATAA TAAATAATAT AATTTTATCA 1320 

CAATGACTTA ATTAAAAAAG TTGTTTAAGT 1380 

TTTACTGAAA ATGTTAGATG ATTGAGCATT 1440 

TAAATTGTAG TATCATAGAA CTAATATTTA 1500 

TTTGTAAAAG TTTTAAAAGT AGGAATAGTA 1560 

TATTACGATT AATGACGACA GGATAAATAT 1620 

GACAAATACT ATTCAACAAG AGTACCTAAA 1680 

ATGGGGTGTA TCATATAATG AACAACGAAT 1740 

AGCAAAATCA TATTTGTCAG CAGATGAATA 1800 

TTATGAA6CA CATAAAGGTC AGTTCCGAAA 1860 

ACAAGTTGCA GGTATTTTAA CAGAAATGC6 1920 

TTTGCATGAT GTAATTGAAG ATACACCGTA 1980 

TGAAGAAGTT GCTCGAATTG TTGATGGTGT 2040 

AAAAGAAGAA CAACAAGCTG AAAATCATCG 2100 

ACGCGTAATT TTGGTGAAAT TAGCAGACAG 2160 

GCCGCGCGAA AAACAAATTA GAATTTCTCG 2220 

ACATCGTCTT GGTATTAATA CAATCAAATG 2280 

TGATAATGTG CAATATTTTA GAATAGTCAA 2340 

AGCGTATATC GAAACGGCTA TTGATAGAAT 2400 

AGGCGATATA AATGGTAGAC CTAAACATAT 2460 

GAAAAAACAA TTTGATCAAA TTTTTGATTT 2520 

TAATGATTGT TATGCGATAC TTGGGTTGGT 2580 

TTTTAAAGAT TATATTGCAA TGCCTAAACA 2640 

AGTAGGCCCA AATGGAGACC CGCTCGAAAT 2700 

TGCTGAGCAT GGTGTTGCAG CACACTGGGC 2760 

AGATCAAACT TATCAAAATA AGTTAAATTG 2820 

ATCGTCTGAC GCTCAAGAAT TTATGGAAAC 2880 



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TGAGTTGCXyV TATGGTGCTG TGCCGATTGA TTTTGCTTAT GCGATTCACA GTGAAGTAGG 3000 

TAATAAGATG ATTGGTGCCA AGGTGAATGG CAAAATTGTA CCAATTGACT ATATTTTACA 3060 

AACAGGCGAT ATTGTTGAAA TACGTACTAG TAAACATTCA TATGGACCAA GTCGTGATTG 3120 

GTTGAAAATT GTTAAATCGT CTAGTGCCAA AGGTAAAATT AAAAGTTTCT TCAAAAAACA 3180 

AGATCGTTCA TCTAATATTG AAAAAGGCCG AATGATGGTT GAAGCTGAAA TAAAAGAGCA 3240 

AGGATTTAGA GTCGAAGATA TTTTGACAGA GAAAAATATT CAGGTTGTTA ATGAAAAATA 3300 

TAACTTTGCA AATGAAGATG ATTTATTCGC AGCTGTAGGA TTTGGCGGCG TGACATCCTT 3360 

ACAGATTGTT AATAAATTAA CTOAAAOACA ACGTATTTTA GATAAACAAC GTGCTTTAAA 3420 

TGAAGCACAA GAAGTTACGA AATCATTGCC TATTAAAGAC AACATCATTA CTGATAGTGG 3480 

TGTCTATGTA GAAGGTTTAG AAAATGTACT TATCAAGTTG TCAAAATGTT 6TAATCCTAT 3540 

ACCaGGTGAT GATATTGTAG GTTATATCAC CAAAGGTCAC GGTATTAAAG TACATCGCAC 3600

TGATTGCCCA AATATTAAGA ACGAAACTGA ACGACTAATT AATGTTGAAT GGGTAAAATC 3660 

AAAAGACGCA ACTCAAAAAT ATCAGGTTGA TTTAGAGGTA AtGCGTATGA CCXSAAATGGC 3720 

TTGTTGAATG AAGTACTACA AGCTGTTAGC TCGACAGCCG GCAATTTAAT TAAAGTTTCA 3780

GGACGTTCAG ATATTGATAA AAATGCAATA ATAAATATTA GTGTCATGGT GAAAAACGTG 3840 

AATGATGTTT ATCGTGTGGT AGAAAAGATC AAACAACTTG GTGATGTTTA TACAGTAACA 3900 

AGAGTTTGGA ACTAGAGGTG CAAAATATGA AAGTAGTTGT ACAAAGAGTT AAAGAAGCAT 3960 

CGGTGACGAA TGATACATTA AATAATCAAA TCAAAAAAGG ATATTGTTTA TTAGTCGGTA 4020 

TCGGTCA6AA CTCTACA6AG CAAGATGCAG ATGTAATTGC AAAGAAAATT GCTAATGCAA 4080 

GATTATTTGA AGATGACAAT AATAAATTAA ACTTTAATAT CCAACAAATG AATGGTGAAA 4140 

TACI^TCAGT TTCACAATTT ACTCTCTAT6 CAGATGTAAA AAAAGGTAAC CGTCCAGGTT 4200 

TCrCAAATTC TAAAAATCCT GATCaAGCGG TAAAAATTTA TGAGTATTTT AATGcaTGCG 4260 

CTACGAGCGT ATGGTCTTAC TGTGAAAACA GGTGAATTTG GAACACACAT GAATGTTAGC 4320 

ATAAATAATG ATGGTCCAGT CACTATTATT TATGAAAGTC AGGACGGCAA AATTCAATGA 4380 

AAAAAATAGA GGCATGGTTA TCTAAAAAGG GTCTTAAAAA TAAACGTACT CTAATAGTAG 4440

TGATTGCCTT TGTCTTATTT ATCATCTTTT TATTTTTATT GCTGAATAGC AATAGTGAAG 4500 

ATAGTGGGAA CATCACGATA ACTGAAAATG CTGAATTACG TACAGGTCCA AACGCTGCGT 4560 

ATCCAGTCAT ATATAAAGTT GAAAAAGGTG ACCATTTTAA AAAGATTGGT AAAGTAGGTA 4620 

AATGGATTGA AGTTGAAGAT ACATCCAGTA ATGAAAAAGG TTGGATAGCT GGATGGCACA 4680 

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TAGTGCTTGA TCCTGGTCAT GGAGGTAGTG ACCAGGGTGC TTCAAGCAAT ACTAAATATA 4800 

AAAGTTTAGA AAAAGATTAT ACGTTGAAAA CAGCAAAAGA ATTGCAGCGT ACTTTAGAAA 4860 

AAGAAGGCGC AACTGTTAAG ATGACAAGAA CAGACGATAC ATATGTTTCA CTAGAAAATC 4920 

GTGATATCAA AGGCGATGCC TATTTGAGTA TACATAATGA TGCGTTAGAA TCATCTAATG 4980 

CAAATGGAAT GACaGTTTAT TGGTATCATG ATAATCAAAG AGCTTTAGCA GATACGTTAG 5040 

ACGCTACGAT TCAGAAGAAA GGTCTACTTT CTAATCGCGG TTCAAGACAA GAAAATTATC 5100 

AAGTGTTAAG ACAAACAAAA GTTCCTGCTG TTTTATTAGA ATTAGGTTAT ATTAGTAACC 5160 

CAACTGATGA AACGATGATT AAAGATCAAT TACATAGACA AATTTTAGAA CAAGCAATTG 5220 

TTGATGGCCT TAAAATTTAT TTTTCTGCGT AGGGCTTGCA AAAATATGTG AAAGTAGTTA 5280 

TCATTGATAT TGAATTTTAT AACTAAAACC GTTAGTATTC TTGAAATGGT AAATGAAATA 5340

GGTAGCAATC TAACTAAGAT TGTGTAGGAA TATAATCCAT AGACTGAAAG ATTAT6CTGA 5400

GTAGTTTATA TACATTGAAC ACAAGAAGAG GTGCTTTATG AAAAGTAAAG CCGTTAAACG 5460 

TACGTTaAAC GTTTTGAGTG GGTTTATTAA ATGCACGCTT ATAAAAAGTA ATGATGATTA 5520 

C/^TTAGGCA TGTTTTTTAA ACCA 5544 
(2) INFORMATION FOR SEQ ID NO: 111: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1067 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 

AAAAGATTGC AAATATAAAT GGCATGTTTA ATATGTTAGA ACAACAAATC ATTCATAGCC 60 

AAGATATGGC TCATTTTAGA AGTGAATTTT TTTACGTCAA TCATGaGCAT CGAGAAAACT 120

ATGAAgCACT CCTAATTTAT TACAAAAATA GTATCGACAA TCCTATTGTA GATGGTGCAT 180 

GTTATATTTT AGCCCTACCT GAAATTTTCA ATAGTGTTGA TGTTTTCGAA TCAGAGTTAC 240 

CATTTTCATG GGTATATGAT GAAAATGGCA TTACCGAAAC AATGAAATCA CTTAGCATTC 300

CATTACAATA TTTAGTTGCA GCAGCTTTAG AAGTAACTGA TGTGAATATA TTTAAGCCTT 360 

CAGGATTTAC AATGGGAATG AATAATTGGA ATATTGCTCA AATGCGAATC TTTTGGCAAT 420 

ATACAGCAAT TATTAGAAAA GAAGCACTAT AACATTAATA ATTAATTAGC TATAAAGATG 480 

ATTCACAACA ATCATCTTTA TAGCTTTTTT ATGTCTAATT ATTTTTGAGG AAAATinACAA 540 

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AATTTTATGT TTTCAAAAGT AAACAATCAA AAGATGTTAG AAGATTGCTT CTATATAAGA 660 

AAGAAAGTGT TTGTAGAAGA ACAAGGOSTC CCTGAGGAAA GTGAAATTGA TGAATATGAA 720 

TCTGAATCTA TTCACCTCAT TGGATATGAT AATGGACAGC CAGTTGCCAC TGCTCGAATA 780

CGCCCTATTA ATGAAACAAC TGTCAAAATA GAACGAGTAG CTGTGATGAA ATCACATCGT 840 

GGACAAGGAA TGGGTAGAAT GCTTATGCAA GCTGTAGAAT CATTAGCTAA AGATGAAGGT 900 

TTTTACGTAG CTACTATGAA TGCCCAATGT CATGCTATCC CATTTTATGA AAGTTTAAAC 960 

TTTAAAATGA GAGGTAATAT ATTTCTTGAG GAAGGCATCG AGCATATTGA AATGACAAAA 1020 

AAGTTAACCT CGCTTAATTA AAAAAAGTTG TATCTATTTT AGAAACA 1067 

(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18613 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:



AAGACGtAtG 


ATAACAACAA 


TACgTGTAGT 


GAAAGATTTT 


AATCTACATA 


TTACTGACAA 


60 


AGAATTCATT 


GTATTTGTTG 


GACCATCGGG 


ATGTGGTAAA 


TCAACAACAT 


TACGAATGGT 


120 


TGCTGGACTA 


GAGTCTATCA 


CATCTGGAGA 


TTTTTATATT 


GATGGGGAAC 


GCATGAACGA 


180 


TGTTGAACCA 


AAGAATAGAG 


ATATTGCGAT 


GGTATTTCAA 


AACTATGCAT 


TATATCCACA 


240 


TATGACTGTT 


TTTGAAAATA 


TGGCATTTGG 


GCTAAAGCTA 


CGTAAAGTAA ATAAAAAAGA 


300 


GATTGAACAA 


AAAGTTAATG 


AAGCAGCTGA 


AATATTAGGA 


TTAACTGAGT 


ATCTTGGTCG 


360 


TAAACCAAAA 


GCGTTATCTG 


GCGGACAGCG 


TCAACGTGTT 


GCTTTGGGCA 


GAGCTATTGT 


420 


TAGGGATGCG 


AAAGTCTTTT 


TAATGGATGA 


ACCATTATCG 


AATCTTGATG CGAAyTtCGA 


480 


GTACAAATGC 


GCACAGAAAT 


ATTGAAATTA 


CATAAGCGAC 


TTAATACTAC 


GACAATTTAT 


540 


GTTACACATG 


ATCAAACTGA 


AGCATTGACG 


ATGGCTAGTC 


GAATTGTTGT 


TTTGAAAGAT 


600 


GGCGACATTA 


TGCAAGTCGG 


CACACCTAGA 


GAAATATATG 


ATGCCCCTAA 


TTGCATATTT 


660 


GTGGCGCAAT 


TTATCGGCTC 


ACCAGCAATG 


AATATGTTGA 


ATGCTACAGT 


TGAAATGGAC 


720 


GGATTGAAGG 


TAGGAACACA 


CCATTTTAAA 


TTACATAATA 


AAAAATTTGA AAAGTTAAAA 


780 


GCTGCTGGCT 


ACTTAGACAA 


GGAAATTATT 


TTAGGTATTC 


GAGCTGAAGA 


CATTCATGAA 


840 


GAACCAATAT 


TTATTCAAAC 


TTCTCCAGAG 


ACACAATTTG 


AATCTGAAGT 


AGTTGTATCC 


900 



AAATTAGATT CAAGAACTCA AGTGATGGCG AACX3ACAAGA TTACACTAGC ATTTGATATG 1020

AATAAGTGTC ACTTTTTTGA TGAAAAAACA GGAAATCGTA TCGTCTAAGG GGGAGTATTC 1080

ATGTCTAAAA TTTTAAAATG TATCACGTTA GCCGTGGTAA TGTTATTAAT CGTAACTGCA 1140

TGTGGCCCTA ATCGTTCGAA AGAAGATATT GATAAAGCAT TGAATAAAGA TAATTCTAAA 1200

GACAAGCCTA ACCAACTTAC GATGTGGGTG GATGGCGACA AGCAAATGGC GTTTTATAAA 1260

AAAATTACGG ATCAATATAC TAAAAAAACT GGCATCAAAG TAAAGCTTGT AAATATTGGT 1320

CAAAATGATC AACTAGAAAA TATTTCGCTA GACGCTCCTG CAGGAAAAGG TCCAGATATC 1380

TTTTTCTTAG CACATGATAA TACTGGAAGT GCCTATCTAC AAGGCTTAGC TGCTGAAATC 1440

AAATTATCAA AAGATGAGTT GAAAGGTTTC AATArGCAAG CACTTAAAGC GATGAATTAT 1500

GACAATAAGC AACTAGCATT GCCAGCTATC GTTGAAACAA CCGCACTTTT TTATAATAAA 1560

AAATTAGTGA AAAATGCACC GCAAACXTTTA GAAGAAGTTG AAGCTAATGC TGCCAAACTA 1620

ACTGATAGTA AAAAGAAACA ATACGGTATG TTATTTGATG CTAAAAATTT CTATTTTAAT 1680

TATCCGTTTT TATTCGGCAA TGATGATTAT ATTTTCAAGA AAAATGGCAG TGAATATGAT 1740

ATTCATCAGC TAGGACTAAA TTCAAAACAT GTCGTCAAGA ATGCTGAACG ATTACAAAAA 1800

TGGTACGACA AAGGGTATCT TCCTAAGGCA GCAACACATG ATGTCATGAT TGGTCTTTTT 1860

AAAGAAGGAA AAGTAGGACA ATTTGTCACT GGACCGTGGA ACATTAATGA ATATCAAGAA 1920

ACGTTTGGTA AAGATTTAGG AGTAACAACA TTACCTACAG ATGGTGGCAA ACCTATGAAA 1980

CCATTTCTAG GTGTACGTGG TTGGTATTTA TCTGAATATA GTAAACATAA GTATTGGGCT 2040

AAAGATTTAA TGCTGTATAT CACTAGTAAA GATACATTAC AAAAATATAC AGATGAAATG 2100

AGCGAAATTA CTGGACGTGT TGACGTGAAA TCATCTAATC CAAATTTAAA AGTGTTTGAA 2160

AAGaAGCAC GTCATGCTGA ACCGATGCCT AATATTCCTG AAATGC6ACA AGTTTGGGAA 2220

CCGATGGGCA ATGCAAGCAT ATTTATTTCA AATGGTAAGA ATCCTAAACA AGCGTTAGAT 2280

GAGGCGACGA ATGATATAAC GCAAAATATT AAGATTCTTC ATCCATCACA AAATGATAAG 2340

AAAGGAGATT AGTTATGACG AAACGTAACC CTAAATTAGC GGCATTATTA TCTGTTATAC 2400

CTGGTTTGGG ACAGTTTTAT AATAAAAGAC CCATTAAAGG GACGATATTT TTTATCTTTT 2460

TCATCAGTTT TATTTCTGTT TTTTATAGCT TTTTAAATAT TGGTTTTTGG GGATTGTTCA 2520

CATTAGGGAC AGTACCTAAG TTAGACGATT CTCGTGTCTT ACTTGCACAA GGTATTATTT 2580

CTATCTTACT CGTTGCTTTC GCAATCATGC TATATATCAT TAATATTTTA GATGCATATC 2640

GTAATGCTGA ACX3ATTTAAT CGCAATGAGG AAATAAAGGA TCCGAAGcGC GTATGGTGGC 2700

TGTAGTTGTA TTTCCATTAA TAyyTATGTT TGGAGTAGCA TTTACAAATT ACAATTTATA 2820

CAACGCGCCT CCGAGACACA CATTAGAATG GGTTGGTTTA 6ATAACTTTA AAACX3TTATT 2880

CACAATTGGC GTTTGGCGTA AAACATTTTT CAGTGTTATT ACTTGGACAT TAGTATGGAC 2940

GCTTGTTGCA ACX5ACACTTC AAATTGCATT AGGGCTGTTT TTGGCAATTA TTGTAAATCA 3000

CCCTGTCGTC AAAGGTAAGA AATTTATCCG TACTGTGTTA ATCCTACCTT GGGCTGTACC 3060

ATCATTTGTG ACAATTTTAA TATTTGTAGC GTTATTTAAT GATGAATTTG GTGCGATAAA 3120

TAATGATATT TTGCAACCTT TATTAGGTGT AGCACCAGCA T6GTTAAGTG A7CCGTTTTG 3180

GGCAAAAGTG GCATTAATCXi GCATTCAAGT ATGGCTTGGA TTCCCATTTG TCTTTGCACT 3240

GTTCACTGGA GTACTGCAAA GTATTTCATC AGATTGGTAC GAAGCAGCAG ATATGGATGG 3300

TGCGTCTAGT TGGCAAAAGT TTAGAAACAT CACATrCCC3G CATGTCATTT ACGCCACAGC 3360

GCCATTGTTA ATTATGCAAT ATGCAGQTAA TTTCAATAAT TTTAATCTTA TTTATCTATT 3420

TAATAAAGGC GGTCCACCAG TGTCAGGGCA GAATGCTGGT AGTACAGATA TCTTGATATC 3480

TTGGGTGTAT AATCTGACAT TTGA6TTTAA CAACTTCAAC ATGGGTGCAG TTGTGTCATT 3540

AATTATTGGA TTTATTGTTG CTATTGTCGC ATTTATTCAA TTCAGACGTA CAAGTACGTT 3600

TAAAGATGAG GGAGGTTTAT AAGATGACAA AGAAGAAAAA CATATTAAAA GCAATCGGTA 3660

TTTACAGTTT TATAGCGATG ATGTTTGTCA TCATTTTATA TCCACTACTG TGGACATTTG 3720

GCATTTCCCT TAATCCAGGT ACGAACTTGT ATGGTGCCAA AATGATACCA GACAATGCAA 3780

CATTTAAAAA TTATGCATTC TTACTATTCG ATGACAGTAG TCAATACCTG ACTTGGTATA 3840

AAAATACX5CT TATCGTAGCA TCTGCAAATG CACTGTTTAG TGTGATATTT GTCACGTTAA 3900

CAGCATATGC TTTTTCTAGA TATCGCTTTG TTGGTCGTAA ATACGGGCTG ATTACATTTT 3960

TGAWTTACA AATGTTCCCT GTATTAATGG CAATGGTCGC AATCTATATT TTGCTAAATA 4020

CAATTGGATT ATTAGATTCT TTATTTGGAC TAACACTGGT ATATATTGGT GGATCAATAC 4080

CGATGAATGC CTTTTTAGTG AAAGGTTACT TCGATACGAT TCCAAAAGAA CTTGATGAAT 4140

CTGCCAAAAT TGATGGTGCA GGGCATATGC GTATTTTCTT ACAAATTATG CTTCCATTAG 4200

CTAAGCCX3AT TTTAGCAGTT GTTGCTTTGT TCAATTTTAT GGGGCCATTT ATGGACTTTA 4260

TATTACCTAA AATACTATTA AGAAGTCCTG AAAAATTCAC ATTAGCAGTT GGATTGTTCA 4320

ACTTTATTAA TGATAAGTAT GCAAATAATT TCACAGTGTT TGCAGCAGGG GCAATTATGA 4380

TTGCAGTACC TATAGCAATC GTATTCTTGT TCTTGCAACG CTATTTAGTA TCAGGTTTAA 4440

CAACAGGTGC GACAAAAGGT TAGTTTGAAA TTAGGAGTGG GGCAGAATTG ATAAAGAACC 4500

GGGTGTGGTG GTATTGCGAA TGGCAAGCAC ATGCCAAGTT TACAAAAAGT TGAAAATGTT 4620

GAAATGATCG CATTTTGTGA CGTAGACATT TCGAAAGCAG CGAGTGCGGC AGAAGCATAC 4680

GGAACTGACA ATGCAAAGGT TTATGATGAT TACAAAGCAT TGTTAAAAGA TGACACGATT 4740

GATGITATCC ATGTTTGTAC GCCAAATGAC TCGCATTGTG AAATTACTGT AGCAGGGTTG 4800

CATGCTGGTA AACATGTGAT GTGTGAAAAA CCAATGGCTA AAACGACAGC AGAAGCTCAA 4860

AAAATGATAG ATACAGCTAA ATCAACAGGT AAAAAATTAA CAATAGGTTA TCAAAATCGT 4920

TTCCGAGCAG ATAGTCAATT TTTACATCAA GCAGCGCAAC GTGGCGACTT AGGAGACATT 4980

TACTTCGGAA AGGCACATGC CATTCGTCGT CGAGCAGTAC CAACATGGGG TGTCTTTCTA 5040

GACGAAGAAG CTCAAGGTGG AGGACCATTA ATCGATATCG GTACACACGC TTTAGATTTA 5100

ACGTTATGGA TGATGGATAA TTATGAACCA GAATCAGTGA TGGGTTCAAC ATTCCATAAA 5160

TTAAATAAAC AGCATCATX3C GGCAAACGCT TGGGGTTCAT GGAATCCAGA TGAATTTACA 5220

GTTGAAGATT CTGCGTTTGG ATTTATTAAA ATGAAGAATG GAGCGACGAT CATTTTAGAA 5280

TCC6CTTGGG OGATTAATTC TTTAGAAGTG GATGAGGCAA AATGTTCATT ATCAGGAACT 5340

AAAGCAGGTG CTGATATGAA AGATGGTCTA CGTATTCATO GTGAAGACAT GGGTACACTT 5400

TATACCAAAC ACGTTGAATT GGAAAACAAA GGCGTCGACT TTTATGAAGG TAATGAAGTG 5460

GATGAAGCTG AAGAAGAAGC AAAAGCTTGG ATTGATGCAG TTGTAAATGA TACTGAACCA 5520

GTTGTGAAAC CGGAACAAGC AATGGTAGTT ACAAAAATTC TTGAAGCGAT TTATCAGTCT 5580

GCAAAATCAG GCAAAGCAAT TTACTTTGAA TAACATCATA CGGTAAGGAG GCACATCATG 5640

ACAAAATTAA AAGTTGGTGT GATAGGTGTT GGTGGTATTG CACAAGACCG TCATATTCCA 5700

GCATTGCTGA AACTCAAAGA CACAGTCTCA TTAGTTGCAG TACAAGATAT TAATACAGTG 5760

CAGOTGATTG ATGTTGCGAA gCGCTTTAAT ATACCTCATG CAGTTGAGAC ACCTAGCGAG 5820

CTGTTTAAAC TTGTTGATGC GGTGGTCATT TGTACACCTA ATAAATTCCA TGCTGATCTT 5880

TCTATAGAAG CATTGAACCA TGGTGTCCAT GTATTGTGTG AAAAGCCAAT GGCGATGACG 5940

ACGGAAGAGT GTGATCGCAT GATTGAAGCG GCTAATAAAA ATCACAAATT ATTAACTGTC 6000

GCATATCATT ATOGTCACAC AGATGTGGCA ATTACTGCTA AAAAAGC/^T TGAATCAGGT 6060

GTGGTTGGTA AACCTTTAGT AGCACGTGTA CAAGCGATGC GTAGGCGTAA AGTGCCTGGC 6120

TGGGGTGTTT TTACCAATAA AGCGTTGCAA GGTGGCGGTA GTTTAATCGA TTATGGTTGC 6180

CACTTGTTAG ACTTATCTTT GTGGCTACTA GGTAAAGATA TGGTGCCGCA TGAAGTGCTA 6240

GGAAAAACAT ATAATCAATT GAGCAAACAA CCGAATCAAA TTAATGATTG GGGAACATTT 6300

GCAAGCATGC AGTTTGAATG TTCGTGGTCT GCAAATATCA AAGAA6ATAA GGTTCACGTT 6420

AGTTTATCAG GAGAAGATGG CGGTATCAAT TTATTTCCAT TTGAAATATA TGAGCCCCGC 6480

TTTGGAACTA TTTTTGAAAG CAAAGCTAAT GTTGAGCATA ACGAAGACAT TGCTGGTGAG 6540

AGACAGGCXSC GTAACTTTGT CAATGCGTGT TTAGGGATAG AAGAGATTGT GGTGAAACCG 6600

GAAGAAGCAC GCAATGTAAA TGCCCTTATA GAAGCGATTT ATCGTAGCGA TCTTGATAAC 6660

AAGAGCATAC AACTTTAATG ATTATCATAT ATGATACAAA ATTCTCAATA TAAAAAGAAG 6720

GAGTGCTTTT CAATGAAAAT AGGTGTATTT TCAGTATTAT TTTACGATAA AAATTTTGAA 6780

GATATGTTAG ATTATGTCTC AGAATCTGGA TTGQATATGA TTGAAGTTGG AACAGGTGGT 6840

AACCCAGGAG ATAAATTTTG TAAGTTAGAT GAGTTGTTAG AAAATGAA6A CAAGC6CCAA 6900

GCATTTAT6A A6TCAATCAC AGACAGAGGC TTACAAATAA GTGGTTTCAG TTGTCATAAC 6960

AATCCAATTT CTCCAGATCC GATAGAAGCG AAAGAAGCCG ATGAAACGTT ACGTAAAACA 7020

ATCCGTTTAG CAAATCTATT AGAC6TGCCA GTTGTTAATA CATTTTCTGG CATTGCAGGA 7080

TCAGATGATA CCGCTAAAAA GCCTAATTGG CCTGTTACAC CTTGGCCAAC AGCCTACTCT 7140

GAAATTTATG ATTATCAGTG GAATGAAAAG TTGATACCAT ATTGGCAAGA TTTAGCTGAG 7200

TTTGCAAAAG AGCAAGATGT AAAAATTGCC ATAGAGTTGC ATGCAGGATT TTTAGTGCAT 7260

ACACCATATA CAATGTTGAA GTTACGTGAG GCTACAAATG AATATATCGG TGCTAACTTA 7320

GATCCTAGTC ATCTATGGTG GCAAGGTATT GACCCAATTG CTGCGATTCG CATATTAGGC 7380

CAAGCAAATG CAATTCATCA CTTCCATGCT AAAGATACGT ATATTAATCA AGAAAATGTA 7440

AATATGTATG GTCTAACTGA TATGCAACCA TATGGTAACG TTGCGACAAG AGCATGGACA 7500

TTCOGTACAG TTGGTTATGG ACATAGTCCA TATGTATGGG CAGATATCAT AAGTCAACTT 7560

ATTATTAATG GATATGATTA TGTATTAAGT ATTGAACATG AAGATCCTAT TATGTCAGTA 7620

GAAGAAGGTT TCCAAAAAGC TTGTCAAACT TTGAAATCTG TTAATATTTA CGACAAGCCA 7680

GCAGACATGT GGTGGGCATA ATACGAACTC GAGGTTAGTC TGAAGTTTGT CT6AAGTAAG 7740

ACTGGTGGCA GTGTTGAATA AATGCATATG TCGCCAAGCC ATTGCCAAAA ATTTCACACC 7800

TTAAATCAAG TCATTGTTTG TAAAGAAGGT GTACTTTATA TAAGTATATA GCGATGGTCA 7860

TACCCATTCA CAGTAACAAT CCTCACCATT GAAAAGAGTA TATAACCTTT TCAATAGTGA 7920

GGTATATGAT AATAAAAAAA GCCTGTTGTC ACAATGGTCA TAGACACGAC ATACTTTAAA 7980

GGTTTCTGAA TATAATATTT CAGAATGCAC TTTAAAGATG GACGTCGATG TAGACTAAAG 8040

TGATGACAGG CTTTCATCTT TTTAAATATT CATTAATTTC TCTTCTTGTT TAATACGTAC 8100

TAATACACCG ATTAATTCAG GAATGATGTT TAAGAAGTAA TTTGGGTGTT TTGTAATTTT 8220

ATATAATCCA GATTTAATAA TAGGATGGTT AGGTAAAATG AATAATTTTA ATGTCCAAAT 8280

ACCACCTAAA GTTTTAATAA CCATAAATAA CATGATATAA GCAAAGATTA ATATAACTAA 8340

GCCAATACCA TTTGCAAAGC TAAATGTATC TTTATTAATA AATGCCTCTA CACCAGCCAA 8400

TACATAAATT AAAACGTGTG TTATTGCTAA AAACTTCGAA TTTTTAACGC CATATTCAAC 8460

TGCACCGTCT GCTTTTAATT GTTTTGAGTG ATTAATAGAT ATCTTTAAGC TGACAAGTCT 8520

GATACAGAAA AAGATAAGTA ATATAGATAG AATCATGATG TCCTCCGTCA TTATGTCATA 8580

TGTATAAGCG TTGATTTTGA CAACATAAAG TATTTTATAG ATAAAGCTTG TCAAATACTA 8640

TTAACTATTT ATTAATTTTA GTACATAAAT ATGTTTCTAA GTATGTGTTT ATGTTCAGTA 8700

TTTTGGATAA TTTAATAATT TTAAGGATAT TAAGCGCTTA CACCGACGTG ATATATTTGG 8760

CTTAAOGAAA ATGATTGAGG TGACAGAGAT GAACTTTTTT GATATCCATA AGATTCCGAA 8820

CAAAGGCATT CCATTATCGG TACAACGTAA ATTATGGCTT AGAAACTTCA TGCAAGCTTT 8880

CTTCGTAGTG TTCTTTGTTT ATATGGCTAT GTATTTAATT CGAAACAACT TTAAGGCGGC 8940

ACAACCGTTT TTAAAAGAGG AAATTGGATT ATCTACATTA GAACTTGGTT ATATCGGATT 9000

AGCATTTAGT ATCACGTACG GTTTAGGAAA AACATTACTT GGATATTTTG TCGATGGACG 9060

TAACACAAAA CGTATTATCT CGTTCTTACT TATCTTATCT GCGATTACAG TTTTAATTAT 9120

GGGATTTGTT TTAAGTTACT TTGGTTCTGT AATGGGATTA TTAATTGTAC TTTGGGGACT 9180

TAACGGGGTG TTCCAATCAG TTGGTGGACC TGCAAGTTAT TCAACGATTT CAAGATGGGC 9240

GCCAAGAACG AAACGTGGCC GATACTTAGG ATTCTGGAAT ACATCACATA ATATCGGTGG 9300

TGCCATAGCA GGTGGTGTTG CACTTTGGGG TGCTAATGTA TTCTTCCATG GAAATGTTAT 9360

AGGGATGTTC ATTTTCCCAT CGGTGATTGC ATTACTTATT GGTATCGCAA CATTATTTAT 9420

CXX5AAAAGAT GATCCGGAAG AATTAGGATG GAATCGTGCT GAAGAAATTT GGGAAGAGCC 9480

GGTCGATAAA GAAAATATTG ATTCTCAAGG TATGACGAAA TGGGAGATCT TTAAAAAATA 9540

TATCCTGGGA AATCCTGTTA TATGGATTCT ATGTGTTTCA AACGTCTTTG TATACATTGT 9600

ACGAATCGGT ATTGATAACT GGGCACCGTT ATATGTGTCA GAGCATTTAC ACTTTAGTAA 9660

AGGCGATGCA GTTAATACGA TATTCTACTT TGAAATTGGT GCATTAGTTG CAAGTTTATT 9720

ATGGGGCTAC GTATCAGACT TATTAAAAGG TCGTCGTGCA ATTGTAGCTA TTGGCTGTAT 9780

GTTTATGATT ACATTTGTTG TCTTATTCTA CACAAATGCT ACAAGTGTCA TGATGGTTAA 9840

CATTTCATTG TTTGCATTAG GTGCGTTAAT CTTTGGTCCG CAATTATTAA TTGGTGTATC 9900


CGCGTATCTA TTCGGTGACT CAATGGCGAA AGTTGGTTTG GCGGCTATTG CTGATCCAAC 10020 

ACGTAACG6T TTAAACATCT TTGGATATAC ATTAAGTGGA TGGACAGATG TTTTCATCX5T 10080 

CTTCTATGTT GCATTATTCC TAGGCATGAT TCTATTAGGA ATCGTTGCTT TCTATGAAGA 10140 

AAAGAAAATT AGAAGTTTAA AAATTTAATA TAAATCGGAT TAAAAGTATC GCCAATCTAT 10200 

TGCAATATAG TTGGCAATCC TGCCCCGACG GCATGTGCGT GAAGAGATGA AAGATACTGC 10260 

TTCTACCCTT GCAAATATAT CATCTCTATG TCTCGGGGCA GATCATAATT CCCTGTTATG 10320 

AAGTATCCTT ATTTGCCCGA CTTAGGGTGA CTCAATGAAT TTACTCCTTA CAATAAAGAC 10380 

ATATAGC3G6T GTCAATATTG TAGGGAGTAT TGTTTTATAT TTAAACTCTC TAAAAAGCGG 10440 

ACTGAAAGAA AAGTGAAAAC TTCTCTATCA GTCCGCTTTT TCATAGAACA AAATGGAGGC 10500 

GCCATAATCA TTAGTTATGT GCTAATCTAT TTTGCTTGCT TACAATAATC ACTTGGCGAC 10560 

ATTTGTAAAT ATTTTTTAAA ATGATAGCTA AACATTTTAT ACTCTQAAAA GCCTACTTTG 10620 

TCTGCAATTT CATAGTGTTT GTAATGTCGA TCTAACAATT GCAGAGATTG TAAAATACGA 10680 

TA60GATTTA AATAATCGAC AATTGTAATA CCAACATGAT CTTTAAATGT TCGCATCGCA 10740 

TACGATTCAC TAACATCGAT ATGTTGAATT AAATCTGAAA CAGtCACTTT CGTTTGATAA 10800 

GATTGCTTAA TTTGATCCAC AATCTGGTTT ACATAATAAT CATCGTATTC TACTTTTAAT 10860 

AGTGGTTGGA AGGCATCATG ACAAGATGCT AAGCTACGGC CGTTCTGTGA TTGTTGCTCT 10920 

AATAAGGTAC GGACAAGTCT TCCTAAAATA ACTTCTAATT GTGCATGGTC TACTGGTTTT 10980 

AATAAATAAT CAAGAACATG ATGTTGAATG CCGGCTTTCA TATATTCAAA GTCATCGTAA 11040 

CTCGATAATA TGATGACATT ACAATCTAGA TGCGCAATAT CATTGAGTAA ATCGACX3CCA 11100 

TTTTTACGTG GCATACGAAT ATCAGTAATT ACTAATTCTG GCTGATGTTG TTGAATTAGT 11160 

6AT»TGCTT CAACACCATC TTTAGCAGTG TATATTGTAT TGAAATGATA GTCTCCCCAA 11220 

GGAATGATTT GCTTTAATCC TTCTOGAATA ATTCGTTCAT CATCACAAAT AACTACCTTA 11280 

AACATCTACA TTCCCCCTTG AAAGTGGTAT TTTATAACAA ATTAACGTAC CTTGATTACG 11340 

CTTTGAAAAA ATATGGAGTC GTGCATGTGA ACCATATTGA ATCATTGCTT TATTGTGTAA 11400 

ATGATTTAAT CCCAAATGCT TAGTATCAAA TACATCATTA TTAAGAGATT GGCGTACATA 11460 

TTGCAGGOGA GATGACGACA TCCCGATACC ATTGTCGCAA ACTAAAACAT GTAAATTCTG 11520 

ACGTGCCAAT GTCAGGCGTA TAGTAATGTC CAATGACTCA GTATCTCTAC CATGTTTAAT 11580 

AGCATTTTCT ATGAGTGGCT GAAGCATCAT TTTACCAATT GTCTGGTGAC GCGCTTCTTC 11640 

AGAACTTTCA ATATGGAGCT TAATCATGTC ATCAAAACGG aTGTTTTGTA TTGCAACATA 11700 
GTAACGTAAC ATTTGCGATA ATTGTTGGAC CACAGTTtGT GCTAATTTCG GAGATAACGT 11820

AATTAAATAT TGTATTGTTT GCATCGTATT GAATAGGAAA TGAGGCTGGA ATTGGCGTTC 11880

TATTTCCTTT AACTGAATAT CACGCAA6CG ACGTTCTGTA TGCTC6ATAG AATGGATCAG 11940

TTGCTCATTT GATTCAAATA AATCGTAAAT ATAATTATTA ATTTCTTCTA GTTCACTGTT 12000

GTTTTTTAAA GGCGTATATG TACCTAGATG ACGATTTTTG GCATAGTAAA TTTTTTGAAT 12060

AATCGTTTCG ATATCTTTTG TTTGTCGTTT AGCCATATTA TCTGCGCTAA TGAAACCAAA 12120

TATTACTAGT AAAACAAGAA CTACGGCCAT AACAATTAAC AACGTGATAC CATCTTCAAT 12180

GTTTTCATGT ATATCTTTAT AAATAATGAG ACGATGGTCA GCATGGTTTA ATTTTACAGA 12240

TTCATTCATA AATCCGAATT GTTGTGGTcT ATACTTTTCA CCTATAGTAA AACX3GTCATC 12300

GTTGGCGTAT AAAATATTGT CATATTGATC AmCGATAAGT GCGAATTGTC GGTTATCTTT 12360

CtTAATTTCA CTTAAACGTG GGGTGTtAGC CATATAAATt TTaAGCATAT ATGTACTATT 12420

TTTGAATTTA AGCTGATGCG TTGAAAATAA ATACATATTT TTAGTGTTTA AATGTTCATA 12480

ATTATTGGTT ATAAACTGAT TTGGTCCAGA TAATTCATAA TAAAGTGTTG CGGGCTGTTG 12540

GkGTATTAAT TTTAATAATT CACGTTTTGT AGCGGTCACA TCATGATGAT TTGyTAAATC 12600

GAGCTCTTGA AACGAATTAT TATGCTGTGT AATAAATGTC TGAATCTGCT TTTCAGTATG 12660

ATGTAAAGAT GACTGACTTT CATCAACATG TTGATGAATC GTACGATGCT CAATCCAAAT 12720

ATAGATGGCA TAGAAGCTTA CTAGTCCAAT AATAATGACT AAAAATACTG GAAAAATAGT 12780

AGACnCAAAT AACGATCGTC TTAATTGATG TCTATAAGGT TTGTATGCCn TCATTGAATC 12840

ATCTCCAAAA ATTTATGATG TGGAATATCC GGTAATTTAG ATTTCGGTAT TAAAGGTATG 12900

TTCTTAAGAT TTTCX5ATAGA CTGATCGCTT TGTTCACTAA CATCCTTTCG AATTGACTTG 12960

GCAtCGAACT CTGCAACTAA TCGTtGTTGT ACTGAGCGGC TTGTTAAATA TTGCACTAAC 13020

TTTTTACGCT TAGGATGAGG GTGTGCATTT TTAACTAAAG CAATrCCATC AACATTTAAC 13080

ATTGTTCCTT CAATTGGATA AACGATTGAT ACAGGATAAC CTTTGTTTTT CCATGTGCGT 13140

GCATCTTGTT CGTAGCTTAG ACCTGCGTAA TATTTACCTT TTGCAACATC TTCAATGACT 13200

TTAGACGTCT TTGACAGTTG CATCGCATGG TTTTGGAATT GATGCACATC ACTTACTCGA 13260

TGATGCATGC TATAAATAGC ACGCATATGT TGATAGCCTG TCGTTGTTGT ATTTGGATTT 13320

GAGTACGCAA TTTTACCTTT AAGTATAGGT TGTAATAAAT CTTGATAACC TCGAATCTTA 13380

ATATCTCCTT GTAAATCTGA ATTCACTACT ATAACTGTTG GCATTAATAG AAAACTAGTA 13440

ACATATTTAT TGTTCGAGCG ATAATCCTCT AATTGCTGTG TTACAGATGT ATCTTGATAG 13500

CCACGCTCCG AAAAATCTTC GTTATGCAAG TTTGAAAGCA GTACTTGAGT AGATCCGTGT 13620

TTAATTTCAA TTTTGACATG CTCTTGTTTT TCAAATTCAT TTAAAATTGG ACGAATCAAG 13680 

TTTGATTGAT ACGGAGAATA AACTGTTAAT ACATTTTTAT CGGATTCAGA GTGACGCGTA 13740

TTAGCGCATG CTGaTAAAAA AATGAGAAAT AATAGCAAGA TATAAATTTT TGATTTCATG 13800 

ATATCCCATC AATTCTATGT ATATTTTAAT ACAATAATTT TAGCAATAAA TGACGCATAA 13860 

GTAAT6TTAA ATATTTAGAA ATGTTTATAG ATGACTTGTT AAGACGTTGC AAATGTTGTG 13920 

ATAGCACAAA ATTTTTGTTT GTCAAGACGA TTTACCGAGG CTGTAAAATC AAACTGTTAT 13980 

ATTTTATTTG TAGCTGTTAT ATAAAAATCG GCAAGATATT GAACGGTTCA AAAGTGAATT 14040

TTTAC6TCAA TAAAAGTATT TAATCCAGTC TCTTCATATA TAAAAGTAAA TCTTTCTAAG 14100 

TGTTGATTTA ACX3CTTATCA ACAATCATTT TTTATAAACA AATATATACT CCTAAATTAA 14160 

CTTTTAAAGC AATGAAAATA GTGAACATTA TAACTGTTGT GTAACAGAAT GCAATTAGCA 14220

TATTACTGTT ACACAAATTA GTACAGTTTC TATGTTTTGA CATACATTTG ATGAAAATTG 14280 

TACATAATTT ATGTGAAAAA AATCACAACA AACATGCTAC AATGACTATG AAAACGTTAA 14340 

CATAGCATTT CAAATTCACA ACATTATACA GATGGAGGCG TTTAGTATGT TAGAAACAAA 14400

TaAAAATCAT GCAACAGCTT GGCAAGGATT TAAAAATGGA AGATGGAACA GACACGTAGA 14460

TGTAAGAGAG TTTATCCAAT TAAACTACAC TCTTTATGAA GGTAATGATT CATTTTTAGC 14520 

AGGACCAACA GAAGCAACTT CTAAACTTTG GGAACAAGTA ATGCAGTTAT CGAAAGAAGA 14580 

ACGTGAACGT GGCGGCATGT GGGATATGGA CACGAAAGTA GCTTCAACAA TCACATCTCA 14640 

TGATGCTGGT TATTTAGACA AAGATTTAGA AACAATTGTA GGTGTACAAA CTGAAAAGCC 14700 

ATTCAAACGT TCAATGCAAC CATTCGGTGG TATTCGTATG GCGAAAgcAG CTTGTGAAGC 14760 

TTACOGTTAC GAATTAGACG AAGAAACTGA AAAAATCTTT ACAGATTATC GTAAAACACA 14820 

TAACCAAGGT GTATTCGATG CATATTCTAG AGAAATGTTG AACTGCCGTA AAGCAGGTGT 14880

AATCACTGGT TTACCTGATG CATACGGACG TGGACGTATT ATCGGTGACT ATCGTCGTGT 14940 

AGCTTTATAT GGTGTAGATT TCTTAATGGA AGAAAAAATG CACGACTTCA ACACGATGTC 15000 

TACAGAAATG TCAGAAGATG TAATTCGTTT ACGTGaAGAA TTATCAGAAC AATATCGTGC 15060

ATTAAAAGAA TTAAAAGAAC TTGGACAAAA ATATGGTTTC GATTTAAGCC GTCCAGCAGA 15120 

AAACTTCAAA GAAGCAGTTC AATGGTTATA CTTAGCATAC CTTGCTGCAA TTAAAGAACA 15180 

AAACGGTGCA GCAATGAGTT TAGGTCGTAC ATCAACATTC TTAGATATCT ATGCTGAACG 15240 

TGACCTTAAA GCAGGCGTTA TTACTGAAAG CGAAGTTCAA GAAATTATTG ACCACTTCAT 15300 
AGACCCAACT TGGGTAACTG AATCTATCGG TGGTGTAGGT ATTGACGGAC GTCCACTTGT 15420

TACX3AAAAAC TCATTCCGTT TCTTACACTC ATTAGATAAC TTAGGTCCAG CTCCAGAACX: 15480

AAACTTAACA GTATTATGGT CAGTACGTTT ACCTGACAAC TTCAAAACAT ACTGTGCAAA 15540

AATGAGTATT AAAACAAGTT CTATCCAATA TGAAAATGAT GACATTATGC GTGAAAGCTA 15600

TGGCGATGAC TATGGTATCG CATGTTGTGT ATCAGCGATG ACAATTGGTA AACAAATGCA 15660

ATTCTTCGGT GCACGTGCX3A ACTTAGCTAA AACATTACTT TACGCTATCA ATGGTGGTAA 15720

AGATGAAAAA TCTGGTGCAC AAGTTGGTCC AAACTTCGAA GGTATTAACA GCGAAGTATT 15780

AGAATATGAC GAAgTATTCA AGAAATTTQA TCAAATGATG GATTGGCTAG CAGGTGTTTA 15840

CATTAACTCA TTA7ATGTTA TTCACTACAT GCACGATAAA TACAGCTATG AACGTATTGA 15900

AATGGCATTA CATGATACAG AAATTGTACG TACAATGGCA ACAGGTATCX3 CTGGTTTATC 15960

AGTAGCA6CT GACTCATTAT CTGCAATTAA ATATGCACAA GTTAAACCAA TTCGTAACGA 16020

AGAAGGTCTT GTAGTAGACT TTGAAATCGA AGGCGACTTC CCTAAATACX5 GTAACAATGA 16080

CGACCX3TGTA GATGATATTG CAGTTGATTT AGTAGAACGC TTCATGACTA AATTAC6TAG 16140

TCATAAAACA TATCGTGATT CAGAACATAC AATGAGTGTA TTAACAATTA CTTCAAACGT 16200

TGTATACGGT AAGAAAACTG GTAACACACC AGACGGACGT AAAGCTGGCG AACCATTTGC 16260

TCCAGGTGCA AACCCAATGC ATGGCCGTGA CCAAAAAGGT GCATTATCTT CATTAAGTTC 16320

TGTAGCTAAG ATCCCTTACG ATTGCTGTAA AGATGGTATT TCAAATACAT TCAGTATCGT 16380

ACCAAAATCA TTAGGTAAAG AACCAGAAGA TCAAAACCGT AACTTAACTA GTATGTTAGA 16440

TGGTTACGCA ATGCAATGTG GTCACCACTT AAATATTAAC GTATTTAACC GTGAftACATT 16500

AATAGATGCA ATGGAACATC CAGAAGAATA TCCACAGTTA ACAATCCGTG TATCTGGTTA 16560

cgciSttaac TTCATTAAAT TAACACGTGA ACAACAATTA GATGTAATTT CTCGTACATT 16620

CCATGAAAGT ATGTAACAAA ATTTAAGGTG GGAGCACTAT GCTTAAGGGA CACTTACATT 16680

CTGTCGAAAG TTTAGGTACT GTCGATGGAC CGGGATTAAG ATATATATTA TTTACACAAG 16740

GATGCTTACT TAGATGCTTG TATTGCCACA ATCCAGATAC TTGGAAAATT AGTGAGCCAT 16800

CAAGAGAA6T CACAGTTGAT GAAATGGTGA ATGAAATATT ACCATACAAA CCATACTTTG 16860

ATGCATCGGG TGGCGGTGTA ACAGTCAGTG GTGGCGAACC ATTGTTACAA ATGCCATTCT 16920

TAGAAAAATT ATTTGCAGAA TTAAAAGAAA ATGGTGTGCA CACTTGCTTA GACACATCGG 16980

CTGGATGTGC TAATGATACA AAAGCATTTC AAAGGCATTT TGAAGAATTA CAAAAACATA 17040

CAGACTTGAT ATTATTAGAT ATAAAACATA TTGATAATGA CAAACATATT AGATTGACAG 17100


TATGGATTCG ACATGTCCTT GTGCCTGGTT ATTCTGATGA TAAAGACGAT TTAATTAAAC 17220 

TAGGGGAATT TATTAATTCT CTTGATAACG TCGAAAAGTT TGAAATTCTG CCATATCATC 17280 

AGTTAGGTGT TCATAAGTGG AAAACATTGG GCATTGCATA TGAATTAGAA GATGTCGAAG 17340 

CGCCCGATGA TGAAGCTGTT AAAGCAGCCT ACCGTTATGT TAACTTCAAA GGGAAAATTC 17400 

CCGTTGAATT ATAAATACAA TTCAGACCGA AAAGAAAGCA TATGCAACTT CAAGAGTGAA 17460 

GGGGCATATG CTTCTTTTTC AATTGAGTAT TGAGTATTAG CAAGACGTAG TAAGTATATG 17520 

AGACAACTTC TACAATGGrT GAAGGAAGAC GTTTTTGTAA GTAGCTATGC TGATAAAGAA 17580 

TGTGATGTCT TGTTAAAGGT GGGGTTCCAA TATCATCATT TAGCTGATGT TGAATGGGTT 17640 

ATTATTTGCT ACTTGCATAT GAATATGAGT CTTTTCAAAT TTTTATTGAC CCTGAGTAAT 17700 

GAAAAATATT AA5ATGAAAC TTAATATTAA AgCAATGCGG AGCGTGATTA TGAAGAGAAT 17760 

TAGTAAAGAT ATATGGGCAG TATTTAAATT ACTGTATCaA AATAAAGGGC GTTTTAGCAT 17820 

TAATGCCTTA CTATTGCAGT TAATCATGAT TTTTATTAGT AGTACATACT TAATTTTACT 17880 

ATTTAATATG ATGTTAAAAG TAGCTGGcAA AGCCAACTTA OQATTAACAA TTGGACGGAA 17940 

ATCGTTAGTC ATCCCGCCA6 TGTGATACTT CTTATTATAT TCATATTAAG TGTTGCCTTT 18000 

CTGATTTATG TAGAGTTTTC ATTGTTAGTT TATATGGTTT ATGCCGGCTT TGATCGACAG 18060 

ATTATTACAT TTAAATCCAT TTTTAAAAAT GCCTTTGTAA ATGTGCGTAA ACTCATAGGT 18120 

GTACCAGTTA TTTTCTTTGT CATTTATTTA ATGTTAATGA TACCCATTGC CAACCTAGGA 18180 

CTAAGTTCAG TATTAACAAA AAATATTTAC ATACCTAAAT TTTTAACX3GA AGAACTTATG 18240 

AAAACGACGA AAGGTATAAT CATTTACX5GT ACCTTTATGA TTGCTGTATT TATATTAAAT 18300 

TTTAAATTAA TATTTACTCT ACCGTTAACG ATTTTAAACC GCCAGTCGTT ATTTAAAAAT 18360 

ATGAGACTAA GTTGGCAAAT TACGAAGCGA AATAAGTTTC GGCTTGTTAT AGAAATAGTT 18420 

ATATTAGAAC TCATCATTGG TGCGATTTTA ACATTAATTA TTTCAGGAGC AACATATCTT 18480 

GCTATTTGTG TAGATGAAGA AGGAGATAAG TTTTTAGTCT CATCAATTTT ATTTGTTGTA 18540 

TTGAAAAGCG CATTGTTCTT CTATTATkTA TTtACGAAAT TATCATTAAT CAGTGT6TTA 18600 

GTACTGCACT TAA 18613 

(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1214 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

AAAGTTTTAA AAGGGGTGAG ATACTTGGCG AATAATCCAT TCCAGCTTTG CGTTTAAAAG 60 

GAATTATACT TGCCATTGTC GGTGCTTGTT TATGGGGATT AGGTGGTACT GTITCTGATT 120

TCTTGTTCAA ATATAAGAAT ATTAATGTCG ATTGGTACGT CACTGCTCGA CTTGTAGTCA 180 

GTGGTGTTTT CTTACTTATT ATGTACAAAA TGATGCAACC CAAACGTTCA ATATTTAGCG 240 

TATTCCAAGA TCGACGTATG TTAGGCAAAT TACTTATCTT CAGTATACTG GGCATGTTAG 300 

TAGTACAATA TGCTTATATG GCATCTATTA ATACAGGTAA TGCTGCGATT 6CAACATTAC 360 

TACAATACAT TGCGCCAGTT TATATTATTA TTTGGTTTGT CATAAGAGGC GTTGCAAAAC 420 

TAACATTATT TGATGTGCTT GCTATTATCA TGACACTATT AGGAACATTT TTATTATTAA 480 

CAAATGGTTC ATTTTCTAAT TTAGTCGTCA ATCCTGCAAG TTTATTCTGG GGTATTTTAG 540 

CTGGTGTAGC ACTCGCTTTT TACACAATTT ATCCTTCAGA CCTACTTAAC CGCTTOGGTT 600

CGATTCTAAT TGTCGGGTGG GCAATGCTTA TTTCTGGTGT TGCGATGAAT TTACGCCATC 660 

CAATTTGGCA CATTGATATC ACTAAATGGG ACATATCAAT TATATTATTT TTAATCTTTG 720 

GTATTATCGG TGGTACCGCA CTCGCATTTT ATTTCTTTAT CGACAGTTTA CAATACATAT 780

CAGCGAAAGA AACAACATTA TTCGGAACTG TTGAACCTGT CGTAOCCGTT ATCGCAAGCA 840 

GTCTATGGTT ACATGTGGCA TTCAAACCAT TTCAAATCGT AGGCATCATT CTTATTATGA 900 

TTTTAATTTT ATTACTATCA CTTAAAAGAC AACCTGAAAC ATTAGATGAA TAAGAAAACT 960 

CTGATAATCA CTTTAGCAAG TAACTATTAT TTAACAACGT AGTTACCTTA TAGGTGATAT 1020 

CAGAGTTTTT TATTTTAGTT AATAATATTT TTCACTTGGT ATAAAAAaGC GTCGTCGCTC 1080 

TGGTAATCGG AAATACTGGA ATAAAATATG GAATTGGGTA ATAATCCCAG GTAnTAAAAG 1140 

TCCAIGTTCC GATAnCCTnT CCGCAnCTCC AACCAAATTT GCCGATAAGG TTCCAAAAGG 1200 

CATCCTGGGG GTAC 1214 

(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9458 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

ATTTTGGTTT CATTCACGAT GGGGTnATAC AGCAAACACA nCTAAAATAA CTATCAATAG 60

CTTAGACAAT AAAAAATATG CCACTACAAT CGCTAATATT ACGATTAAAA AAGAAGCGTT 180

AACGATTACT TTCATCGTTG TTCTATCTCT GAACATCATA TTAAAGACAA CTAGACTAAT 240 

TGATAATGAA ACAGCAAAAA AAGTAATAGC TAACACTAAT TTCATCATAA ATAGACAGAC 300

TAAACCTATG ACTAATAATG TATTAGAAAT TACAGCTGAC GTTTTTAACA TTCTCGaATT 360 

AATATGCACT CACCCTTTTT ATTTAAATAA CTTACATAAT CATAATAATA CATGATGTTT 420 

CATAGGCCTG TCGATGATTG ATTCACAATA GCACGTGATT TTTTTGTTTT TCAATATTAT 480 

TCATTTATTC CATCAAAAAC ACCCTTTTTA ATTTTTACAA AAATTAAAAA AAGTGCTCCT 540 

ACACTGCTTG CATGTAGAAA CACTTTTTCA TTGTAATGTT ATTCTTCTCG AGACATACCT 600 

TTTAGCATAT TAAOCATGTA TGTTAAACTA CGGTTCATGT CGTCATCTTT CAATACGCCC 660 

AATAGACTTC TTATAGTTGT CTTAGCATTT GGACTCOCTT GATTGGCAAC GTGTAATCCT 720 

TTATTAACTT TATTTAGGAA GTCGCTTAAA TCTGATACAT TGAGTTCACC TAATAAAAAT 780

ACCATTGAAG CCATATTAGA TAATAGCCCT GTATAAATAT CTTTATTAAG TTCAACTGCA 840 

AATTTATTTA TGATGACTTG ACGTCCTCGA ATTGCACCAT TTAAAGCATC TAATAGTTTT 900 

GCATCATCTA ATGTTTTAAT AAGCTTGATT GCTTTTAATA TACTATCTTT ATTCGCTGCA 960

ATTGCCTCTG TAACTTCATT TAAACTTTCT AACTTAATTT GTTCTTCTGA TTTTTCTAAG 1020 

CGTCTAATTT TAGAAGATAT TCTCTCAGCC ATTATTTATC CACCTGATTT CCCGGGAAAA 1080 

CATAATCTGA ACGTTCCCAT TTTTTCTGTA CTTGAACACT GTACTGCGGT TGACGTTTTT 1140 

TATTGACACG GAAATTATTA GGGTTCAACG GTGACTTACC ACGTTTCGTA ATTACCTCCA 1200 

AACGACA6CT AGTACGTTTA TAAGATGGTG TATCCGTGTA TTGATCAACA TCACTaTTAG 1260 

TTAATAAGTT AATTGCACCT AGATCTCCAT TTTCCATCGC aTCaTTATTT AATGGAATAT 1320 

AGAOTCTTT ACCTTTAACA CGATCTGTCA CGTGAACTTG TAATACCGCT TCTCCTGTyT 1380 

CAGAAATCAG CTTAACTTCT GCACCTTCAT GAATGCCTCT ATCTTCAGCA AGCTCTGGAG 1440 

AAATTTCAAC AAATGCACGT GGCACTTTGT ATTTAATCAT TGGTGTTTGA TAAGTCATAT 1500 

TACCTTCATG GAAGTGCTCT AACAATCGAC CATTGTTTAC ATGAATATCA TAAATTTCAT 1560 

CTTGCTTAAA GTAATTATCA AATGATAATG GGAATAATTT TGCTTTACCA TTATCAAAAT 1620

TGAATCCTTC TAAGTATAGA ATAGGCTCAT CAGTACCATC AGGTTGTACT GGCCATTGTA 1680 

AACTATTGAA TCCTTCTAAA CGATCATAAC TTACCCCAGC ATATAGAGGT GTTAAGCGTG 1740

CTACTTCATC CATAATTTCA CTAGGATGCT TGTAATTCCA ATCAAATCCT AATCTATTAG 1800 

CAATTGCTTG GAAAATTTTC CAGTCAGGTT TTkAATCACC AAGAGGTTCT AATGCTTGGT 1860 
TTGCTGGCAA TACAACATCT GCGTATGTTG CTGTGAATGT TAAAAATTCA TCTTGGACTA 1980

CCATGAAATC TAATTTTTCA AACGCAGCTT GTACAAAATT AATATTTGAA TCCACAATAC 2040

CCGTATCTTC ACCATATAAG TACAATGAGT GTACTTCTCC GTCATGTATA CCTTCTACCA 2100

TTTCATGATT ATCTTTACCA GCTTTTGGAT TCAATTTAAC GCCATATTCT TTTTCAAATT 2160

TAGCGCGAAT ATCATCCGCT TCAATACTTT GATAACCAGT AATCTTATCA GGCATACTTC 2220

CCATATCACT ACATCCTTGA ACATTATTAT GTCCACGTAA TGGATACGCA CCAGTACCAG 2280

GACGACGATA ATTACCTGTT ACTAATAATA AGTTTGAAAT CGCTGTACTT GAGTCACTAC 2340

CAATGTCTTG TTGTGTAATA CCCATTGCCC AACAAATTAC AACAGATTCA GCTTTAGCAC 2400

ATTCTTCAGC AAATTTAATC AATTCTGATT CAGGAATACC TGTTGCTTCT TCAGCAAAAG 2460

CCATTGTAAA TGTTTCTAAT GATTTGTAAT ATTCATCAAA ATCATCTACC CACTCATCAA 2520

TAAATGCTTT ATCGTGTAAA TXiaVTGATCAA TAATATACTT AGTCACTGCA CTTAACCACG 2580

CTAAATCCGT ACCTGGTTTA GGTTGATAAA AACGATCOGC ACGTTCTGCC ATTTCATGTT 2640

TTCTAATATC AAATACATGT ATTTTTTGAC CAAATAAITT TTGTGCACGT TTCATGCGTG 2700

ATGCGATAAC TGGATGAGCT TCGGCTGTAT TAGTACCTAT CAATACAGAC ATTGCCGCTT 2760

TTTCTAAATC TTCAATACTA CCTGAGTCAC CGCCGTGTCC AACCGTrCTA AATAAGCCTT 2820

TTGTTGCAGG TGCTTGGCAA TATCTTGAAC AGTTATCAAC GTTATTTGTG CCAATAACTT 2880

GTCTTGCTAA TTTTTGCATT AAATACGATT CTTCATTCGT CGCTTTAGAA GAAGAAATGA 2940

ATGATAGTGC ATCTGGGCCA TGCTTTTCTT TAATAGCTGT AAAATTATCT GCAATGACGT 3000

TTAAAGCTTC ATCCCATTCT ACTTCATGGA ACTCACCATT TTTCCTTACT AGTGGTTTAG 3060

TTAATCGTTG ATCTGAATTA ATATGTCCCC ATGAAAACTT ACCTTTAACA CAAGTCGCAA 3120

TTTTATTTGC TGGAGAATCA TGTGATGGTT GTACTTTTAA AATTTCTCTA TCTTTAGTCC 3180

AAACTTCAAA TGAACAACCC ACACCACAAT AAGTACACAC TGTTTTAGTT TTCTTAATAC 3240

GCTCTTTACG CATTTCTGCT TCTGAATCTG AGATT6CAAA TAGTGGACCA TAACCAGGTT 3300

CTGCTTTTTT AGTTAAATCA ATCATTGCTG CTAATGAACC AGGTTCCGTA TCAGTCATAT 3360

AACCCGCATT ACCTTCCATA TTCACTTCCA TCATGGCATT ACATGGACAT ACCGTCGCAC 3420

ATTGACCACA AGATACACAT GAAGACTCAT TAATCGGTAC ATCATTATCC CAAATAACAC 3480

GTGGATGTTC ACGATCCCAA TCAATTCTAA TAGTTTCATT CACTTCGATA TCTTGACATG 3540

CTTCTACACA ACGCCCACAT AAGATACATT GATTTGGATC ATAACGATAA AATGGGCCGT 3600

AATCTTTTTC GTATGGCTTC TCTTTATATT CATACGTTTG ATGCTGAAGC CCCCATGCAT 3660

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TATGCTTTTC TAAAATTCGA TCAAGCGCTT 
CAGTATTTAC AGTCATTGGA CGATCAATCA 
^ CAATCTCAAC AGTACATGTA TCACATGTTT 

TTGAAGGTAC AAAAGTATCT TGTGATTTAA 
CAAGATAATC TTTTCCATCA AGTGTAACCA 

10 

CTATATATAT TTTCCGTAAA TGACTTTTAA 
TGCCCCACAC ATCTTTCAGA TAGAATTAAT 
AAGTAAAATT TTGTATTTTG CCTTTTTACA 
A3TAAATCAT CTTTTTGTTT AATTGAAAAT 
TTTCACGCTT TTTGCCATAT CTTTCACAAC 
20 CACCTAAAAA TCGTTATACT ATTTATAAAT 

TTGATAAATA TCTACTATCA TTTAGAAGGT 
GATTAATTTA TAAAAATCAA ATCAGGCATT 
ATCACCTTCT ATTTACGGGC TATTAGTTCT 
CTAATTAATT TGTGTACAAT TTTGATAACT 
TCTTTTAATA ACTTAGTACT TTCAGCTTTT 

30 

CCGTCACTTT GAATGCCGCC TTGACCACTC 
TCTTTATAAT TGCTTCTAAT CGTATTCAAA 
TTTTGAATTT CATTCATTAG ACTATTAAAA 

35 

TTGGCCATCG CTTCAAGCAC AATTTGCTGA 
CCrrCTTCTT TACGACTTCT AATAAACTTC 

40 TGTCCTTTTG TAAAACGAAC ACCATCAACA 

CCXSCCGATGC TATCTATCAT ATTATGCAAA 
ATTGGCACAT TCATTAATTT TTCAAGTGAT 

^ TAGGCATGTG CAATTTTTTC AGTAGTACCA 

GGTATACTTA CTATTTCAGT TTTCTTCGTT 

tCACTACGCT CTCCGCCACC CTTTTTCTTA 

SO 

GCGATTGTGA ATGGATCACC ATCGTTTAAA 

TTGCGATCTA ACGGATTGTG TATCTTATTA 

55 



CTTTTTGAGC ATCTTTCACA TCATTGTTCA 3780 

CCGTACTACA TGAACGTTCA ATTTTACCGT 3840 

GAATTGGTCC CATCGACTCG TTATAACAAA 3900 

TAAATTCAAG TAAATTCGTA CCTGGTTCTA 3960 

CCAAATGTTC TTGCATATTA CTCACCCCGT 4020 

TAAATTGCTC ATATCCACCT AAAATAACX5A 4080 

TTAATTGTAT TACTTTATGT ACTAGTTGTT 4140 

ATCATTTTTA TTTGAAATAT TTTGCGCGAA 4200 

AATTATCATT ATTAGTTTTC CAATTATCTG 4260 

CTTATTAATG ACAATATTTA ATAATCACCT 4320 

ACCCTTTTTC TGAAAATTAA TAACCCAAGT 4380 

AATATTTATC TTTAAATTAA ATTTGTAATG 4440 

AAATAAAATA GCCCATAAAT ACAAAGTGTT 4500 

ATTCGTTATT CTATTTACAG ATCATTCTAT 4560 

TATTTTCCCT TAGTTTACTA CTCTAGATTA 4620 

GACTGCTCAC TAGGAATGAA GTAGTACAAT 4680 

AATTGATGTT TATTAATCGT GTCATTAGCA 4740

TCACCTAATG TTAAATCTGT TTTAACATTA 4800 

TGTGTAATCG ATGATGGGCT TGCAATCTTA 4860 

CGTTGTTGTC GACCAAAGTC ACCACCAGCA 4920 

AAT6CTTGAT CACCATTTAC ATGTGTCTGC 4980 

QTGAATGTAT CATTACTTAC TACATCAACA 5040 

CCATCCATAT CGATTCTCGC ATAATGATCA 5100 

TTAACAGCCA TATTTGGTCC ACCATATGCA 5160 

CGGCCAACAA TTTCCGCTCT TGTATCACGC 5220 

TTAGGGTTGA TAGATAAAAT CATAATACTA 5280 

CGATCAGCAT CTGAATCGAC ACCAAATAAA 5340 

CTCACTTTTT TATCTCTTAA TTCTGAATGA 5400 

CCAGTAATAA AAATTTTAGC AGCTACATAC 5460 
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10 



15 



GGTAGGCTCA TTTTACTTTT AGACGAACGT TTCAATCCCA CCACTCCTTT ACTATTCCTT 5580

ACATACTTTG TCTGTTTTCT CTATTTATTA TATAGTAAAA TAATTTTTTT ACTATACTTC 5640 

TGTAGACGTA TAACTATTTT TTATCATTTT TTATCTCTAG AGAATATCTA TCTGTATTTT 5700 

TGATAACCAC CATTTGCATT TAAAATTTTA AGTACCGTTT CATGACATGC TTTATTACTT 5760 

ATAATAAAAG GTGCACCCTT TAAATGATCA ATTGCCTTAC CATCTAAAGT CGTCATTTTT 5820 

AGATTCAATA GTTCTGCAAA TAAAAACTGT GCAGCAATGT CCCAAGGTTT AGGATTTGTA 5880 

TTAATATGTG CCCCAAATTG ACCTTTTGCC ACTCGCATAG AATCTAATCC GCAAGCACCA 5940 

ACTAAACGAT AACTAAATGA GGCGTCAAAT AAATCTTGCA CCGTATCTAG ATTCATCACT 6000 

TGTGCATTAA ACGATATAAT AGCGTCTTCC AATTTTAACG ATGGTGGTTC TTCCATCTTA 6060 

ATTCCATTAC AAAAAGCACC TTCTCCTCGT ATTGCTTTAT AAAOCTTTTT ATGCGQATAA 6120 

TCATATACGT ACGATAACAT TGGTTTACCT TCATAAAAAT ACGCCAATAT AATACAATAA 6180

TCTTCTTGCT GTTTTACTAA ATTGGCAGTT CCATCAATGG GATCCATAAT CCATAAATGA 6240 

TTAATTTCAT TCGTAATCAT TTCATTACTT TTTTCTTCCG CTAATAGTTG GTGTTCCX3GA 6300 

AAATGTGTTG CTAAAAATTG TTGGAATTGT TGTTGAATCT GTTTATCTAC ATTTGTAACT 6360

AAATCAAATC GAT6ACGCTT AGTTTCTGTA GTCATTTCCA TAATTAATTG CXK3AATAACA 6420 

TTGTCTATTT GTTTCAACCA CGAACATATT AACTTATCTA TTTGCTGTAA TGTTTTATCT 6480 

GTCATTTCGT CCACCACTTC TCATATCATT ATCATTTTAT TATTACCCTA TATTAAAAGA 6540 

ATCAACAATA CAACTGAAGA CTTCTTCATT TTATGCATAA AAAAATCGGC TAGTCACGTG 6600 

CTAGCCGACA AATAGAAAGG AAAGTAAGTA ATAAATATTG AAGATGTTGT GATGTAACTT 6660 

GAACGATTAA AAGCTATCTG TTATATAGCT CTACCCCTTT GTTTAATCGC TCCCCCTGTT 6720 

ACAAGTAATA TCATAGCACA ATCTTTTTTA AAATGTAAGC GTTTTCCACA AAATTTTTAC 6780 

GATTTTTTTA AAAAGATATT GAAAATGTCC TCATTGTCAC TCTTATGTTA TACTTTGTGT 6840 

AATATATCAT CTTTTAGGAG GTGGCTGTCA TGAATAAAGC TGAAAGGCAA AATTTAATAA 6900 

TTACTGCAAT TCAACAAAAT AAAAAAATGA CCGCTTTAGA ATTAGCTAAA TATTGCAACG 6960 

TATCCAAACX5 CACAATTTTA AGAGATATTG ATGATTTAGA AAATCAAGGT GTTAAAATTT 7020

ATGCGCATTA TGGGAAAAAT GGTGGTTACC AAATACAACA AGCACAATCT AAAATTGCAT 7080 

TAAACTTATC TGAAACACAA TTATCAGCCT TATTTTTAGT GCTTAATGAA AGTCAGTCGT 7140 

ACTCQACATT ACCATATAAA AGCGAAATCA ACGCAATTAT AAAACAATGT TTAAGTCTTC 7200

CACAAACACG CTTAAGAAAA TTGCTTAAAC GCATGGACTT TTATATTAAA TTTGATGACA 7260 
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ATAGTTACTA ATGAATTGAA TAAGTTCAAA GGCTTTGAAA CATCATATAT AATAAACGAA 9180 

AATCAAGTTT CCTATTATGA AATTATAACA CTACTTAATA AACGTCCCCT CgACAAGTCG 9240 

ACTATGGTAA CAAAATTCAA TATCTTAATT TTTATCATAC AGAACTATCT AACGCATTAT 9300

TTGCAATTAA ATTTGCCCAT TAACCTATTT TTCATAAAAT GTCATTTAAA CAAGTTATTT 9360 

ATTAAAATTC ACTTTATTAC ATAAATTATA CAATTArAAA GTTTCTTCAA ATTGTAAAGA 9420 


TGCATTAATC GAGTTATAAT CATAATGATT AAGATGGT 9458 

(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 910 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 

AnGCGTATCA TGTCACGCAT TTTAACTACT TCTTTACCAC AAGATTATAC AGTCACATTA 60 


GTTGATCGTA TGCCATTTCA TGGATTGAAA CCAGAATTTT ATGCTTTAGC TGCGGGCACG 120 

AAATCAGATA AAGATGTTCG TATGAAATTC CCTAATCATC CACAAGTGAA TACAGTTTAT 180 

GGTGAAATTA ACGACATAGA TTTAGATGCT CAAATTGTCT CAGTCGGTAA TTCTAAAATT 240 

GATTATGATG AGCTAATCAT TGGTTTAGGA TGTGAAGATA AATATCATAA CGTTCCAGGA 300 

GCCGAAGAAT ATACACATAG TATTCAAACA CTCTCAAAGG CTCGGGATAC TTTCCATAGT 360 

ATTAGTGAAC TACCAGAAGG TGCTAAAGTC GGTATCGTTG GTGCTGGATT AAGOGGCATA 420

GAACTTGCCA GCGAATTAAG AGAAAGTAGA TCAGACTTGG AAATATATCT TTATGACCGT 480 

GGGCCGCGAA TTTTAAGAAA TTTTCCAGAA AAATTAAGTA AGTATGTTGC GAAATGGTTC 540 

GCCAAAAATA ATGTTACCGT TGTTCCAAAT TCAAATATTA ATAAAGTTGA ACCTGGTAAA 600 

ATATATAACT GTGATGAACC TAAAGATATT GATTTAGTTG TATGGACAGC AGGAATTCAA 660 

CCTGTTGAAG TTGTTCGTAA CTTGCCGATT GATATAAATA GTAATGGACG CGTGATAGTT 720 


AACCAGTATC ATCAAGTACC AACATATCGT AACGTCTATG TAGTTGGTGA TTGTGCTGAT 780 

TTACCACATG CGCCAAGTGC TCAGTTAGCC GAAGTTCAAG GTGATCAAAT TGCCGATGTG 840 

CTTAAAAAGC AATGGCTAAA TGAACCATTA CCTGACAAAA TGCCGGAACT AAAGGTACAA 900 


GGTATCGTTG 910 
(2) INFORMATION FOR SEQ ID NO: 116: 
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ATCAGACACA ACACCATGCT CTATATCAAT 
GCGTGCTTTT AAATAATCAT CATCAATTAA 
^ ATCCGCCAAG TATTGCGCAC GATCACTATA 

CAAGTAATCA ACAGATCTTG GACCCATAGA 
TATTTGAATT ACCGTGATAC CGCCAGAACT 

10 

TTTAAATGTT GCACTGATTG GCGCTTTAAT 
AGTGATTGTC CCACCACATG CTTTGACAAC 

,5 ATAAAATGCA TTAAACCCTT GTTCTCTTAA 
TACAATCCAA TCACCTTCAC GCCAATATTO 
ATGATACTTT GTCAATCGTG CGTGTTGCTG 

20 TGCATGACCT TCAATGGCTA GTTCAATTGC 
AGCATAACGC TTGTGAATAT AATCAAACAG 
ACCATGTGTA GTCATATCAA AAAATGATTT 

25 

TTTGTCTACA TGTTCAGGTG CTGTCTCACX3 
TTGCTCATAA TATAGCAAAT ACCCGCCACC 
ATTCAATGCC AGTTGAATTG CAATCACTGC 

30 

ATCCTTACCA ATTTTAGCCG CAAGAGGATG 
TTTTGTCTGT TTGTCATTTA AGTTAATGAC 

35 ATTTAAAACA TTATTGATTA ATGGCTTTTT 
TACCAGTATC GACAAGTGGT GTAATCGGTG 
ACT^T^ATAAA TTGATCCTGA TCTATCGCAT 

40 TCTTAAATAT ACCTTTTTTA ATATTTAGCA 
GAATAATGCT TAAATTTTTA TCCGACTTAA 

CCAACATATC AATTGAATGA TTTCTAAGTT 

45 

ACTTCAATGT AATATTTTTA ATTTTAGCTG 

ATCCTGGATT ACGTTGAAAC GTTGCTTGAT 

CACTTGCATA CAGCGCATTT TTCCCATCTG 

SO 

ATCCTTTTGG ATATTCTTGA TTTACTTGAT 
AAGAGTGTGT TAAGTAATTT ACCTCTCGAG 

55 



ATTTGCTTTA TTGCTATCAA TGAGCGTACT 1620 

TGACTGTACA GGCACCTCAT GAAAATTATC 1680 

TGCTAAATGC ATCGCTTGTA TCAAATGATG 1740 

TGGTAAATCG ACATGTTCTA ATAACTTCAA 1800 

AGATGGTCCC ATTGaATAAA TGTCATAGTC 1860 

CTGAATGTCA TATTTGGCTA GATCCTCTAA 1920 

ATTGACTAAT TGTTTCGCAA TGTCACCTTT 1980 

TATTTGAAAT GTCTTACCTA ATTCGGGTTG 2040 

ATTTTCATGC GTAAATACTT GTGCCGTTTC 2100 

GCGCGAATAT TTTTCAGTAG CCCAATTGGC 2160 

AGGATTAATT AAATCTTCCA ATGACAATTT 2220 

CTTTGGAATT GCTGGCACAG CGACAGITTT 2280 

ATATTCGCCT GAATCATCTA GATAAAATTG 2340 

TGCATCAAAC GCAGTTATAC TGCCAGTACT 2400 

ACCAATACCT GATGCAAATG GTTCTACCAC 2460 

ATCCATGGCG TTGCCACCTT GATCTAATAC 2520 

TGATACGGAA ATTAACCCTT CTTTAGATGT 2580 

CATACTATAT CCTCCTACTT TCTGTTAAAT 2640 

CTACTTTTTC TAAATCTTGA CGTTGCTOGT 2700 

ATGCAATTTT AAATTTATCG CCACGATAAA 2760 

TAACTACTGC TTGTCTCAAG TTTGGATGCG 2820 

TTAAAAAGAC TGACTTGCGT CCATTTTTGC 2880 

TTAAATCAAA ATGTTTTTGA TTCACATCTG 2940 

CTGACAATGC ATTATTCGGG TCACCATTAA 3000

GTCCATAACT ACCTTTTTCT GTTTCGTTGA 3060 

ATGCATTTTT CTGTGTCATA ATGTATGCGC 3120 

AATTTGCAGG AATTGTACTG CTATCCCCAT 3180 

TAACAAATTT TTTAGATAAA ATGCCTGCCG 3240 

GCATCGATTG ATCTGTCGTA ATTTTAACAA 3300 
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TATAAGCTTT AATCAACTTA TCATAGATTG 


ATTTATCGTC CTTGTCTTTC TCTTTACGCA 3420

ACTGATCGAT GTCCTCATCT TTTAATATCT TGATGTCATT TATATGTTTG TGCATATTGT 3480

AAGTATTATT GTTAGGCACA GACTTTTTAT CACGTGCTCT ATCTAAAGAA AACTTAACAT 3540

CTTCAGCCGA TACACGCTCT CCAGTATTAC GTGCTTGTCC ATTGACCACT TTCGCAAAAT 3600

AATCATCATC TCTTAACAAG AAATAAAATG CTTTATTGTC CTTATTCACA GCATAATCAT 3660

GACTTAACGA ACCTTTCGTT GTTAAATGAT CATTTTCATC TAATAATAAT AACCTTGTGT 3720

ACATATTCAT ATTAATTGAA TATACTGACG GCGCAATTGA ACGTATTGGA TCCAATGTAG 3780

GAATTTCACC ATCTTGTTGT GTCATCACAA GTGGCC6CGT ATCTCGTTCT CTACTATTGT 3840

TGTAATCAAA TTGTTGCCAT ATTAATGCAC GTGAATTTGG CAATCCAACA CTATTTTTAT 3900

CTAACACTTT ATTGTCATAT ACTAAATTCT TTTTTGATCC ATATAAAGGC GCCATATACC 3960

CTTTATCAAA TACAACTTCA TCTTCAATTT GCTTATATGT TTGTTTAACA TCTGCTTCAT 4020

TTTGAGTAGA AGCTTTATTT AACAACTGGT CTACATGTTT ATCTTTCAAT AAACTATTTG 4080

ATCCTGTAGA ACTAAATAAT GCCGTCATAG CATAGTTCGG GTCACCAAAC ACTGTCATCC 4140

AGTCATCAAT TTGGATATCA TAATTGCCGG CTTGACGTTG TGTACGATAG CTACCATAAT 4200

CTGGTTGGAT ATTCATCTTC ACGTTAAATC CTGCATTTTC CAATTGATCT TTAACGATAT 4260

TCATATCATT TTCATAACTT GCTTGTCCTA GGAAATGTAT TGTTGGTCGC TCGCCTTTCA 4320

CTTCAACTTT CGATGACTTT TGAGCCACTT CTGATTTCGT AGGGACACCA CAACCACTTA 4380

ATACCAAOGC TAAAACTATA ATTGCGATAC TAATGATTTT CTTCACATCT ATCCCTACCT 4440

TTTTAATGAA TTCTTGGATC TAGTGCATCA CGCACTGCAT CACCTATAAA ATTAAATGCT 4500

AAAACGACGA ACATAATACA AACACCAGGT ACAATAGCTA AATTACTGTG CGTTTCCAAG 4560

TAGKACTAC CGGTACGTAA AATGTTGCCC CATTCAGCTA CATCAGGTGC AACACCAAGT 4620

CCTAGGAAAC TTAAACTACT TGTTGTTAAT ACAACCACAC CTATATTTAA TGAAAAACGT 4680

ACAATCATAG GCGCAATCGC ATTCGGTAAA ATATAACGCC ATATGATATT CCAAGTGTTT 4740

TCACCAGTGA TACGTGCTGC ATCTACATAT TCCATGCGTT TAATTTCTAA AACACTGGCA 4800

CGCATTGTCC GTGCAAATGA TGGTATATTA CCGATACTTA AAGCAATAAT TAAATTTGGA 4860

ATACTTGCTC CAAATGATGC AATAATTGCC ACCGCTAACA ATAATGATGG AATTGCAAAC 4920

ACTACATCTA AAATTCGCAT TATTAAATTA TCAATATGAT TAAAATAACC TGCGATAGTG 4980

CCTAGTAACA CACCAAAAAT AACTGCAATA ACTACTGAAA TAATTGAAAT TGAAAATGTC 5040

AGCTTCGTTC CTACAACTAC GCGTGTAAAT AAGTCTCTAC C6AAATCATC AGTACCAAAC 5100






CTTGTAAAAT AATCTTGAGT AGATTACTAT 
ATTTGTGaAT AGGGAGGCAC AACATCATGT 
^ TACAATTCAA TTATGATGAA ACTACAGTTC 

GAAAAAAACA TATCCTAGGT ATTGTTGGTG 
AATCTATTTT AGGGCTACTA CCAGATTATC 

10 

TTAATGGGCA ATCGTTAAAT AATTTATCAA 
ATATTTCAAT GATTTTTCAA GATCCACTCT 
AACAAATTAC AGAAGTAATA TTTCAACATA 
7GACAATAGA CATTTTAGAA AAAGTAGGTA 
ATCCACATGA ACTTTCTGGT GGTATGCGTC 
20 TAAAGCCACA AATTTTAATC GCAGATGAaC 

ATCAATTACT GCAGTTAATG AAGTCCCTTT 
TCACTCACGA TTTAGGCGCT GTGTATCAAT 
GAAGTGTCGT TGAAAGTGGC ACGGTTGAAA 
CAAAACGCTT AATAGATGCG ATTCCTGATA 
ACAATGATAT TTTATTAAAA TTCGATCGCG 

30 

CCTATACCGA GCAGTTAATG ATATTAACTT 
TGTCGGTGAA TCAGGGTCAG GGAAATCGAC 
AGTGTCAGAA GGCTTTATTT GGTATAACGA 

35 

ATTGAAATCT TTACGACAAG AGATACAAAT 
TCCAkGATTT AAAGTCATTG ATGTGATTAA 

40 AGATAATGAT GACATTATTA AAACTGTCGT 

AACTTTCTTA TATCGCTATC CACACGAATT 
CGCGAGAGCA CTTGCTGTTG AACCTAAAGT 

^ AGACGTTTCA ATTCAAAAAG ATATCATCGA 

CATCACTTAT TTATTCATCA CACATGACAT 
TGCAGTTATG AAAAATGGCG AAATCGTTGA 

SO 

TCCGCAGTCA GACTATGCAA AGCAACTTAT 
GTCATGC3GTT GTGCAACTTT ATCACTGTAT 

55 
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GATATACAAA AGTATAGAAT AAATTTACAC 7020 

CAAATTTATT AGAAGTCAAC AGTCTGAATG 7080 

AAGCGGTAAA AAACGTCTCT TTCGAATTAC 7140 

AATCAGGATC AGGAAAAAGT ATTACCGCTA 7200 

CAGATCACAC ATTAACAGGA GAAATTATTT 7260 

CTTCAGCGTT AC7UVCAAATT CGAGGTAAGG 7320 

CTTCGTTGAA TCCAAGATTA ACGATTGGCA 7380 

AACGTGTATC TAAATCTGAA GCAAAGTCGA 7440 

TAAAACATGC AACTCGACAA TTTGATGCTT 7500 

AACGTGTCAT GATAGCAATG GCATTGATTT 7560 

CAACAACGGC ATTAGATGCC AGTACACAAA 7620 

ATGAGTACAC AGAAACATCT ATTATTTTTA 7680 

TTTGCGACGA TGTGATTGTA ATGAAAGATG 7740 

GTATTTTTAA ATCGCCACAA CATACCTATA 7800 

TTCATCAAAC GCGTCCGCCA AGACCGTTAA 7860 

TGAGyGgGAT TACACATCAC CGAGTGGCAG 7920 

GGCTATTAGA AAAGGCGAAA CATTAGGCAT 7980 

ATTAGCTAAG ACGGTCGTCG GTCTAAAGGA 8040 

ATTACCATTA AGTTTATTTA AAGATGATGA 8100 

GATTTTTCAA GATCCATTCG CATCTATTAA 8160 

ACGACCACTA ATCATTCATG GGAAAGTCAA 8220 

ATCGTTGTTA GAAAAGGTTG GCCTAGATCA 8280 

ATCTGGTGGG CAACGTCAGC GTGTAAGTAT 8340 

GATTGTTTGC GACGAGGCAG TGTCCGCTTT 8400 

GTTATTAAAA CAATTACAGT TAGACTTC^SG 8460 

GGGTGTTATC AATGAAATAT GTGATCGCGT 8520 

ACTGAATAAC ACAGAAGATA TTATCAAACA 8580 

TTCAGAAGTA GCAGTTATTG CTAAATAAAA 8640 

GGTCTGAAAT AAATTGCXSCG ACTTCTGATG 8700 






TATCAAGTTT TAGGTGCTTT GCCATGATTT AAGAGTCACC CCCATACTTT GGGCATTTTA 8820

ACGCCAGAAT AAATCCCCCG CCACTATGTG AAGTGTGGGG GATTATTTAT ATTTTATTAG 8880

AATATTCAGA TTTTTGAGTG TGTCAACTTA GCTTAGTCAA TGTATATTTA ACGTCACTTA 8940

CTCTTTTTCT TTCATAATTA ACACATTCAA ATAAACTTTG ATCAAAAAAC ACAAAGTTAA 9000

AAGTACCATC TTGTAATATG CTCTCATACA TTATCCCGTC ATATTTAAGG CTTCGAATAT 9060

AATCA6CTAA ATATTGAAAT GGCAAATAAT CTATTCCTTG TTCATCGCTT GGATTTGTTA 9120

TTCCTTTATG AATCTTTTTT AATGTTTGGT AATTTACAAA ATACTTTCTA AATCCATCAT 9180

CGCCAGCTTT GATTGCATTA CTAGTTAAAT TAGTTAAATT CGCAATTTTC AATTTCTCTT 9240

TTGTCACGTT TTTTTGTAAC TTAACCTTAC CTATATAAAT AATGTCATTA TGCTTAGGTT 9300

TAACTTCTTC TATACTGACC TGTTCTTTTG TACTAAGGTA TAATAC6CTT ATCCATTTAG 9360

AATTCAATCT TCCTGCCGTT GCAAATCCCT TTGGTGGTGA CATTAGTTCA CTTTTCTCTG 9420

TAATGAACTT AACTATTCTA GATCTATATA ATGGTTCAAA TCTTTCTCTA AATTCCTCAA 9480

TACTATAGTA ATTAGTAGTG ATATCGAGAA AGAACGCTAA ATTCTCTAAA TTGATCATAT 9540

TTTTATGAAA TCTAriTlTA TACTTCAAGC TCTCACAAAA TCCATCCCAG TCATTATTTG 9600

CTACAATTAG ATTTTTATTT GTATATTTTT TATCGTTTAT GATTTTAGCG CCTACTAAAT 9660

CTTCCAACAC TCGTCTATCT AAATTTTCAT CATCTTTAAA AAOTTCATTT AAAATACAAC 9720

TTATTTGAGC TTCCTCAACA TTAAATATAC TCCAGTCGTC TTTTAATGCT ATTTCAATCT 9780

TTTTACCTTC TTTTGGGCTA AAAGTATCTG GTAAATTTAT ACTAATATCA TATAATTCTA 9840

ATGCTGGTCT TAAATAATCT CTAATAAGTT CTAATTTATC TATGTCCTTA GTCGTATCAA 9900

ATATTTTAAC ACCAAGATGA TTGTTATCAA TATCACAATT GTCAAATTTG CTATTTATCA 9960

TTTGCAATGA TTTCTACXaiT TTCAGTATTA TTAAAACATT TTTCACATAT TTTCATTTTG 10020

AGACtCCAAG TATCTATTCA TAATTTCTAG GTGATGCATG ATAGATAACC TTTTAATTAA 10080

ACCTAATCCT GGATaCTTAT TATTTTCATT TAATTCTTCA AATTGTCCCA AGCGCATAAG 10140

ATCTATTTTT AATATCTAAG TTTTTTGACC ATGTTACTAA TT 10182

(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3491 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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AACTCAGGCA ATTGAAACAG CATTAGGTGC 

AAAAGATGGA CGCCAGGCTA TTCAATTTTT 

5 

TTTACCATTA AATGTTATAC AGAGTAGAGT 

AGAGGCAAAC GGATTTATTA GTATCGCTTC 

AAATATTATC GGGAATTTAT TAGGTAATAC 

10 

TGAATTGGCA CGTGCGATTA AATATCGAAC 
AAATCCTGGT GGtTCTATGA CTGGTGGTGG 
AAAAGACGAG TTGACAACAA TGAGACACCA 
ATTTGAACAA CAATTTAAAG AGTTGAAGAT 
T6AAAAAAGT CAAAAGCATA ATACACTTAA 
20 CGATA6ATTA ACTACACAAG AAACACAAAT 

AAAAAATGAT GGTTATACGA GTGACAAAAG 
TCTAGAAAGT ATTAAAGCAT CTTTAAAACG 
ACTTTCTAAA GAAGGTAAGG AAAGCGTTAC 
ATCTGATCTT GCTGTGGTTA AAGAGCGTAT 
AAATAATCAA AATCAACAAA CTAAACATCA 

30 

CTTTAATTCG GATGAAGTGA TGGGCGAACA 
TGGTCAACAA GAAACX3AGAA CAOGCTTATC 

^ TATTGAGTTG AATGAACAAA TCGATGCGCA 

TATTTTAGCT ATCGAAAATC ACTACCAAGA 
ATTMTTCAT CATGCGATAG ATCATTaAAT 

40 ArATCTGAAT ATACGaGTGA TGrATCGATg 
AGaTGyCGAT TGATGrACTA GGTCCTGTAA 
TAAATGAACG TTATACATTT TTAAGTGAAC 

^ CATTAGAGCA AATTATAAGT GAAATGGATC 

TCCATGCTAT TCAAGGACAT TTTACAGCTG 
CAGAATTGCA ATTAACTGAA GCCGATTATT 

SO 

CACCGGGTAA AAAGTTGCAA CATTTATCGT 
CTATTGCTTT ACTATTTGCA ATTTTAAAAG 

55 



TTCATTACAA CATGTCATTG TAGATTCAGA 60 

AAAAGAACGT AATTTAGGTC GTGCGACGTT 120 

GGTAGCGACT GATATTAAAT CTATTGCTAA 180 

GGAAGCAGTT AAAGTAGCAC CAGAATATCA 240 

GATTATCGTT GATCATTTAA AGCATGCAAA 300 

TCGTATTGTT ACTTTGGAAG GTGATATTGT 360 

CGCTCGTAAG TCAAAAAGTA TTCTGTCTCA 420 

ATTAGAAGAT TACTTGCGTC AAACAGAATC 480 

AAAAAGTGAT CAATTAAGTG AACTGTATTT 540 

AGAGCAAGTG CATCATTTTG AAATX3GAGCT 600 

AAAAAATGAT CATGaAGAAT TCGAATTTGA 660 

TCGACAAACT TTGAGTGAAA AAGAAACTTA 720 

ACTAGAAGAT GAAATTGAAC GCTACACAAA 780 

TAAAACACAA CAAACCTTAC ATCAGAAACA 840 

TAAAACACAA CAACAGACAA TAGATCGATT 900 

ATTAAAAGAT GTTAAAGAAA AAATTGGATT 960 

AGCTTTTCAA AATATTAAAG ATCAAATTAA 1020 

AGATGAATTA QATAAATTGA AACAACAACG 1080 

AGAAGCTAAA CTACAAGTTT GTCACCAAGA 1140 

TATTAAAGCT GAACAATCAA AGCTAGATGT 1200 

GATGrATATC AATTGACTGT TGAACGTGCG 1260 

ACGCATTACG TAAAAAAGTT AAGTTAATGr 1320 

ACTTAAATGC AATTGAACAA TTTGAAGAGT 1380 

AACGTACAGA TCTTCGTAAA GCTAAAGAAA 1440 

AAGAGGTTAC TGAAAGATTT AAAGAAACTT 1500 

TGTTCAAACA ATTGTTTGGT GGAGGCGATG 1560 

TAACAGCTGG TATTGATATT GTGGtACAAC 1620 

TACTGAGTGG TGGTGAGCGT GCATTAACTG 1680 

TAAGATCTGC ACCTTTTGTT ATATTAGrTG 1740 
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10 



TATCAGACGA AACACAATTC ATTGTTATTA CACACCGTAA AGGAACAATG GAATTTGCAG 1860 

ATAGGTTATA CGGTGTAACA ATGCAAGAAT CAGGTGTTAC TAAACTTGTG AGTGTGAATT 1920 

TAAATACAAT AGATGATGTG TTGAAGGAGG AGCAATAATG AGCTTTTTTA AACGCTTAAA 1980

AGATAAGTTT GCAACAAATA AAGAAAATGA AGAAGTTAAA TCCTTAACAG AAGAACAAGG 2040

TCAAGACAAA TTAGAAGATA CACATTCTGA AGGTTCAACG CAGGACGCAA ATGATTTAGC 2100 

AGAAAATGCT GAAGTGAAAA AGAAGCCACG CAAGTTGAGT GAAGCGGATT TTGATGACGA 2160 

TGGCTTAATA TCAATTGAAG ATTTTGAAGA AATTGAAGCT CAAAAAATGG GTGCTAAATT 2220 

TAAAGCAGGA CTC6AAAAAT CTCGTCAAAA TTTCCAAGAA CAATTAAATA ATTTGATAGC 2280

GAGATATOGT AAAGTAGATG AA6ACTTTTT TGAAGCTTTA GAAGAAATGT TAATCACTGC 2340 

AGACGTCGGT TTTAATACAG TGATGACGTT AACTGAAGAA TTACGTATGG AAGCACAACG 2400 

ACGTAATATT CAAGATACTG AAGATTTGCG TGAAGTCATT GTTGAAAAGA TCGTAGAGAT 2460

TTACCATCAA GAAGATkATA ATTCAGAAGC TATGAACTTA GAAGATGGTC GTTTAAATGT 2520 

CATTTTAATG GTTGGTGTGA ATGGTGTTGG TAAAACAACA ACAATTGGAA AATTAGCTTA 2580 

CCGATATAAA ATGGAAGGTA AAAAAGTAAT GTTAGCTGCG GGCGATACTT TTAGAGCGGG 2640 

TGCTATTGAT CAATTGAAAG TTTGGGGCGA ACGTGTTGGT GTAGACGTAA TTAGCCAAAG 2700

TGAAGGTTCT GATCCAGCTG CTGTTATGTA TGATGCgATT AATGCCX3CTA AAAACAAAGG 2760 

TGTTGATATT TTAATCTGTG ATACCGCTGG ACGTTTACAA AATAAmACAA ATCTAATGCm 2820 

AGAATTAGAA AAAGTTAAGC GTGTAATTAA TCGAGCAGTG CCAGATGCGC CTCATGAAGC 2880 

ATTACTATGT TTAGATGCTA CAACTGGTCA GAATGCGTTG TCACAAGCTA GAAACTTTAA 2940 

AGAAGTAACA AATGTTACAG GTATTGTATT AACXSAAATTA GATGGTACAG CCAAAGGTGG 3000 

TATGGTATTA GCCATTCGTA ATGAATTGCA CATCCCAGTT AAATATGTAG GTTTAGGTGA 3060 

GCAA^AGAT GACTTACAAC CATTTAACCC TGAAAGTTAT GTCTACGGCT TATTCGCTGA 3120

TATGATTGAA CAAAATGAAG AAATAACAAC AGTTGAAAAT GATCAAATTG TAACAGAAGA 3180 

AAAGGACGAT AATCATGGGT CAAAATGATT TAGTtAAAAC GTTACGAATG AATTATTTGT 3240 

TTGATTTTaT CAATCCTTAT TGACGAATAA ACAACGTaAT TATTTGGAAT TATTTTATCT 3300 

TGAAGATTAT TCTTTAAGTG AAATCGCAGa TACTTTTAAT GTGAGTAGaC AAGCAGTTTA 3360 

TGATAATATA AGAAGAACTG GCGATTTAGT TGAAGATTAT GAAAAGAAAT TGGAATTATA 3420 

CCAGAAATTT GAGCAACGCC GAGAAATATA TGATGAAATG AAACCACATT TAAGTAATCC 3480 

AGAACAAATA C 3491

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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4253 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 
AGTACGTTTT ATAATTATAA GTACGTAATT AACATATTAA CATATCGCAA GTATGTATTT 
AAATAAgAlT GTTATAATTT CAAAGTTCAT CCAAGaTTAT GGCGTTTGCA TTTACCTATT 
AAAAAOGTTA TTATATCAAA GATGCGAAAG ATAATACGGG TTTATTTTAT GAAAGTGAGA 
AGGATAAAAT GGATAATGAG CAACGCTTAA AAAGAAGAGA GAATATAAGG AATTTCTCGA 
TTATAGCACA TATTGACCAC GGAAAATCTA CATTGGCTGA TAGAATTTTA GAAAATACCA 
AATCAGTTGA AACAAGAGAT ATGCAAGATC AGTTACTAGA TTCAATGGAT TTAGAAAQAG 
AACGTGGTAT TACAATCAAA TTAAACGCgT ACGTTTAAAG TACGAAGCTA AAGATGGAAA 
TACTTATACA TTCCATTTAA TCGATACGCC TGGACACGTC GATTTTACAT ATGAAGTGTC 
ACGTTclTTG GCAGCTTGTG AGGGCGCGAT TTTAGTAGTA GATGCGGCTC AAGGTATCGA 
AGCACAAACA TTAGCAAATG TTTATTTAGC ATTAGATAAT GAGTTAGAGT TAITGCCTGT 
TATTAACAAA ATTGATTTAC CTGCTGCAGA ACCTGAACGC GTGAAACAAG AAATTGAAGA 
TATGATAGGT TTAGACCAAG ACGATGTTGT TTTAGCAAGT GCTAAATCTA ACATTGGAAT 
TGAAGAGATA CTAGAGAAAA TAGTTGAAGT TGTGCCAGCT CCAGATGGTG ACCCAGAAGC 
ACCACTAAAA GCGTTAATAT TTGATTCTGA GTATGATCCA TATAGAGGGG TAATTTCATC 
GATAAGAATT GTGGACGGTG TTGTTAAAGC CGGAGATAAA ATTCGAATGA TGGCCACTGG 
TAAAGAGTTC GAAGTAACAG AAGTTGGAAT TAATACACCT AAGCAGCTTC CAGTTGATGA 
ATTXACAGTT GGTGATGTTG GTTATATTAT TGCAAGTATT AAAAATGTTG ATGATTCTAG 

ggttggtgac accatcacat tagctagtag acctgcatca gaaccattgc aaggttataa 
gaaaatgaat ccaatggtat attgcggact gttcccaata gataacaaaa attataatga 

TTTAAGAGAA GCATTAGAAA AATTACAATT GAATGATGCA TCATTAGAAT TTGAGCCTGA 
ATCGTCACAA GCATTAGGTT TTGGTTATAG AACTGGTTTC TTAGGTATGT TACACATGGA 
AATAATTCAA GAAAGAATTG AAAGAGAATT TGGTATTGAA TTAATTGCAA CTGCACCATC 
TGTAATTTAT CAATGTGTTT TAAGGGACGG TTCAGAAGTG ACGGTTGATA ACCCAGCACA 
AATGCCAGAT CGTGATAAAA TTGATAAAAT ATTTGAGCCA TATGTTCGTG CAaCTATGAT 
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TATAAATATG GACTATTTAG ATGATATTCG TGTAAATATT GTTTATGAAT TACCTTTAGC 1560 

TGAAGTTGTA TTTGATTTCT TCGATCAACT TAAATCTAAT ACTAAAGGAT ATGCATCATT 1620 


TGATTATGAA TTCATCGAAA ATAAAGAAAG TAATTTAGTC AAGATGGATA TTTTATTAAA 1680 

TGGTGATAAA GTGGATGCGC TAAGCTTCAT AGTTCATAGA GATTTTGCAT ATGAACGTGG 1740 

TAAAGCATTA GTTGAAAAAC TTAAAAC6TT AATTCCAAGA CAGCAATTTG AAGTACCTGT 1800 

ACAGGCTGCA ATAGGACAAA AAATTGTAGC GCGTACAAAT ATTAAATCAA TGGGTAAAAA 1860

CGTTTTAGCT AAAT6TTATG GCGGTGACAT AAGCCXTTAAA CGTAAATTAC TTGAAAAACA 1920 

AAAAGCAGGT AAAGCTAAGA TGAAAGCAOT TGGTAATGTT GAAATTCCAC AAGATGCTTT 1980

CTTGGCTGTA TTGAAAATGG ATGATGAATA ATTTTAAAAA ATCAATTAAC AATTTACAAT 2040 

GAATAAAGTT TAATAACTAA AAAGAGGGAG CCTAGGATAA ATTAACGTCC TGGGCTTTAC 2100 

AATGTTATAT TGGCAGCCAT CGACAGAGTT AAAATGAGCT TATAACAATG GGGCCCCAAC 2160 

ACAGAAGCTG ACGAAAAGTC AGCTTACTAT AATGTGCAAG TTGGGGTGGG GCCCCAACAT 2220 

AGAGAATTTC GAAAAGAAAT TCTACAGGCA ATGCAAGTTG GGGTGGGACG ACGAAATAAA 2280 

TTTTGCGAAA ATATCATTTC TGTCCCACTC CCTTATGCAT GAGTTTTACT CATGTAATTT 2340 

TATTTTTAAG GACATATTAC ATCTGGCTAA TGTGTAAGAG CCACTACATA ATAAATCATT 2400 

AGTGGTTCTT TATTATTTCT ATCTCACTCC CTCTAAACAA GAATAAATAT TAAAATGAAT 2460 


CX3ATATATTA GACAATCATT GATTAAACGT TAAAGTTAAA AGTAAGAATA ATTGCAGATA 2520 

GTCCAACAGG ATATAGCCGA TTGGATAAAA AGTCTGAGAA GCGGGGCATT AAAATGACX3G 2580 

TACAAAGTGC ATATATACAT ATTCCATTTT GTGTAAGAAT ATGTACATAT TGTGATTTCA 2640

ATAAATATTT TATACAGAAT CAACCTGTAG ATGAGTACTT AGATGCACTA ATCACAGAAA 2700 

TGTCTACAGC AAAATATAGG ATCTTAAAGA CCATGTATGT AGGTGGCGGC ACACCAACGG 2760 

CCCTTTCTAT TAATCaGTTG GAAAGATTAC TTAAAGCAAT ACGTGATACG TTTACAATCA 2820

CAGGCGAGTA TACATTTGAA GCAAATCCTG ATGAGTTAAC TAAAGAGAAA GTCCAACTAT 2880 

TAGAGAAATA TGGAGTAAAA AGGATTTCAA TGGGCGTTCA AACATTCAAG CCGGAGTTAT 2940 


TGTCTGTTTT AGGTAGAACG CACAATACTG AAGATATTTA CACTTCGGTG TTAAATGCTA 3000 

AAAACGCAGG TATTAAATCA ATCAGTTTAG ATTTAATGTA TCATTTACCG AAACAGACGA 3060 

TTGAAGATTT TGAACAAAGT TTAGATCTAG CTTTAGATAT GGATATTCAA CATATTTCGA 3120 


GTTACGGCTT AATACTTGAA CCTAAAACCC AATTTTATAA TATGTATAGA AAAGGCTTGC 3180 

TCAAACTTGC TAATGAGGAT TTAGGT6CTG ACATGTATCA GTTGCTGATG TCTAAGATAG 3240 

SS 
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AACATAATAA GGTTTACTGG TTTAATGAGG AATATTATGG ATTTGGAGCA GGTGCAAGTG 3360 

GTTATGTAGA TGGTGTGCGT TATACGAATA TCAATCCAGT GAATCATTAT ATCAAAGCTA 3420 

5 

TAAATAAAGA AAGTAAAGCA ATTTTAGTAT CAAATAAACC TTCTTTGACT GAGAGAATGG 3480 

AAGAAGAAAT GTTTCTTGGG TTGCGTTTAA ATGAAGGTGT GAGTAGTAGT AGGTTCAAAA 3540 

AGAAGTTTGA CCAATCTATT GAAAGTGTCT TTGGTCAAAC AATAAATAAT TTAAAAGAGA 3600 

AGGAATTAAT TGTAGAAAAG AACGATGTGA TTGCACTTAC AAATAGAGGG AAAGTCATAG 3660 

GTAATGAGGT TTTTGAAGCT TTCCTAATAA ATGATTAAAA AAAATTGAAA TTTCGAGTCT 3720 

TTAACATTGA CTTACTTTGA CCAATTTGAT AAATTATAAT TAGCACTTGA GATAAGTGAG 3780

TGCTAATGAG GTGAAAACAT GATTACAGAT AGGCAATTGA GTATATTAAA CGCAATTGTT 3840 

GAGGATTATG TTGATTTTGG ACAACCCGTT GGTTCTAAAA CACTAATTGA GCGACATAAC 3900 


TTGAATGTTA GTCCTGCTAC AATTAGAAAT GAGATGAAAC AGCTTGAAGA TTTAAACTAT 3960 
ATCGAGAAGA CACATAGTTC TTCAGGGCGT TCX3CCATCAC AATTAGGTTT TAGGTATTAT 4020

GTCAATCGTT TACTTGAACA AACATCTCAT CAAAAAACAA ATAAATTAAG ACGATTAAAT 4080 


CAATTGTTAG TTGAGAATCA ATATGATGTA TCATCAGCAT TGACATATTT TGCAGATGAA 4140 
TTATCAAATA TATCTCAATA TACAACTTTA GTTGTTCATC CTAATCATAA ACAAGATATT 4200 
ATCAATAATG TACACTTGAT TCGTGCTAAT CCTAATTTAG TTATAATGGT TAT 4253 


(2) INFORMATION FOR SEQ ID NO: 119:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3395 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

TCCCTAATCG AACAAAATTA TGCGCATAAA CAAAGTAGAT TGATATAAAA TTCTTAATTA 60 
TCAGAATATA TTTACAAATC TGAATTTTAT TAGTATATTG GrTAGTrTTC ATAGAGGCAT 120 


GACGGTaTTT GAGCAGGATT TTAAATCGGg ATTTTATAAT CGATTTAAGA GAGGCCACtT 180 
TGCTTGcACA TTAATACTGT CAATGGGAGG GGAATGTATA TQAGTrAAGC ACATCAATTA 240 
ATTCAAGAGG ATGAACATTA TTTTGCGAAA TCAGGACGTA TTAAATATTA TCCGTTAGTG 300 


ATTGATCATG GATATGGAGC AACATTGGTT GATATTGAGG GGAAGACATA TATCGATTTG 360 
TTATCGAGTG CGAGTTCTCA AAACGTAGGT CATGCACCTA GAGAAGTAAC AGAA6CGATA 420 
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GTACGTTTAG 


CTAAGAAGCT TTGTGAGATT GCACCTGGAG ATTTTGAAAA AAGAGTGACC 540

TTCGGATTAA CCGGATCAGA CGCAAATGAT GGCATCATTA AATTTGCCAG AGCATATACA 600

GGGCGTCCTT ATATCATTAG TTTCACTAAT GCATATCATG GTTCAACTTT TGGCTCATTG 660

TCTATGTCAG CTATTAGTTT AAATATGCGC AAACATTATG GTCCGTTATT GAATGGTTTT 720

TATCATATTC CGTTTCCAGA TAAATATCX3T GGTATGTACG AGCAGCCACA AGCTAATTCA 780

GTAGAAGAAT ATTTAGCACC CTTAAAAGAA ATGTTTGCGA AGTATGTACC TGCTGACGAA 840

GTAGCATGTA TTGTTATTGA AACGATACAA GGCGATGGTG GACTTTTAGA ACCAGTTCCA 900

GGGTATTTTG AAGCGTTAGA AAAGATTTGT CGTGAACATG GTATTTTAAT CGCTGTCGAT 960

GATATTCAAC AAGGTTTTGG GAGAACAGGT ACATGGAGTT CAGTCTCGCA TTTTAATTTT 1020

ACGCCTGATT TAATCACTTT OGGAAAATCC TTAGCAGGTG GTATGCCTAT GTCAGCAATT 1080

GTTGGACGCA AAGAGATTAT 6AATTGTTTA GAAGCACCAG CACATTTATT TACAACAGGT 1140

GCTAATCCAG TTAGTTGTGA AGCTGCATTA GCCACAATTC AAATGATTGA AGATCAGTCG 1200

CTTCTTCAGG CTAGTGCGGA AAAAGGGGAA TATGTTAGGA AACGAATGGA TCAATGGGTA 1260

TCTAAATACA ATAGTGTAGG CGATGTTAGA GGTAAAGGTC TGAGCATTGG TATTGATATT 1320

GTTTCCGACA AAAAACTCAA AACACGTGAT GCCAGTGCGG CACTTAAAAT TTGTAATTAC 1380

TGCTTTGAGC ATGGCGTAGT TATTATAGCT GTAGCAGGAA ATGTGTTGCG ATTCCAACCG 1440

CCATrCGTAA TAACATATGA GCAATTAGAC ACGGCGTTAA ACACTATAGA AGATGCACTG 1500

ACTGCTTTGG AAGCAGGTAA CTTAGATCAA TATGACATAT CT6GACAAGG TTGGTAATAG 1560

CGATTATCTT AATATAAAAT AAAAAATCAT TTCCACATCT GGATGTTAAT CAGATGGGAA 1620

ATGATTTTTT TTATTTTTTA TTTTGGTGGG TGGTATTCAG CTACGTCATT TTTCTTAGAA 1680

TGTCTAAGTC CATAACTTAA ATATAGGATG ATACCAACAA TAAACCAAAT TAAAGTGTAT 1740

AATTTCGCTT CGAATCCTAA TCCCCAGAAT ACTAGCAATA CTAAAACAAA TGTAATTGCT 1800

GGTAACACAG GATATAAAGG TAATTTAAAT GCAGGAATTG GTAGATCTTT ACCTTcACGC 1860

TTTCTCAAAC GATACATTGC TAATGAAACG AACATAAATG CAACAAGTGT ACCTGCTGAA 1920

t\i, i.r\r\L 1 o i. o AT AGAAr CAA TTAAAACACC AATAATAGTA 1980

AGTATAACTA GTGCGCGATT AGGTAAATGT TTGTCGTTTA AGTGGCTTAA CCATGAAGGT 2040

AATAAGCCGT CACGTCCAAA TGAATAAAGT AAACGTGAGC CTGCTAACAT CATACCAATT 2100

AATGCTGTAA ACATACCGAT AACAGAGATA GCTTGAACAA TAGCTGCTAC AACACCATGA 2160

CCACTTTGAC GTAAAGCCCA ACCAACAGQT TCAGCATTGT TTGCGTATTG TGAGTAATGG 2220
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CCAAGAATAC CTCTAGGCAT TGTC TT T TG A GGATCAAGTG CTTCTGCTGA GTTTGCTGCG 234 0 

ATAGAATCGA AACCGATATA CGCTAAGAAA ATCATTGAAA CACCAGCATA TATGCCTTGC 2400

5 

CATCCACCAA AGTCACCTGT AGCAGTTACT TTGTGTTCTG GAATAAATGG CACATAGTTA 2460 

CTAACATTTA TTGCTGTTAA ACCTACGATG ACAAATAAAA TAATAGCTAA TACTTTTAAA 2520 

ATAACTAAAA TATTTTCCAT ACGAGCTGCT TCCGACATAC CACGTGATAG TAATAATGCA 2580 

10 

GTTAATAAAA TAACGATAGC AGCAATAATA TC6ATAAAAC CX3CCATTTGT ACCAAATGGA 2640 

TTTGATAATG CTGCAGGTAA TTCGATGCCA ATTGGTTTCA CAAGTCCGCG TAAATTCGCT 2700 

GAGAATCCTG ATGCAACAAA GGCTACGGCG ATAAAATATT CAGCTAATAG AGCCCAACCG 2760 

GCAACCCATC CAAAAAATTC ACCAAATAAT ACATTGACCC AAGAATAGGC TGAACCTGCyV 2820 

AATGGCATAG CGGCAGCCAT TTCTGCATAA GTAAATGCAA CTAAACCAGC AACAATAGCA 2880 

GCGAGTAAGA ATGATAACGC AACX«3CCGGT CCT6CATGTT CTGCAGCAAC AATGCCAGGT 2940

AGCGTAAAGA TAGATGTCGA TACAATTGTT CCTACACCTA AAGCTAAGAA ATCACGCACC 3000 

CGAAGTGTAC GCTTTAAATG ACCATCTTTA TTTTGATAGA TAGCCGGATC CTCTTTTCGT 3060 

GCTATTTTAT TGAAAAAACT TCCCATAAAC TTTCCTCCCA AACATTCATA AACAATTCTA 3120 

TACGGTGTTT TTTAATATGT TATATCATAG CACAAATAAT CAATATTTTG TCTAAAAATT 3180 

CTGAAAAATC ACAACTTTAT GTTACGTATT AATGACTTGT CTTGATAACA TCCATAGATT 3240 

30 

TTTTAAATGA TAAAACTGAT TATAACAGAT ATTAAATGAA TAAGTACTAT TTTTTGCnAA 3300 
TTTTCTAACA ATTTTGCACA TTATATGTTT AAAATCAATT TCATGTTTAT GGTCTGATTG 3360 
GCTAGTGTGT ATGAAATGTA AnTCTTTGAC TniiGA 3395 


(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13508 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

ATCAGGTAAT GCCATGCGTT TAGCTGAAAA TTTTTTCAGA ACGTTTAAGT GATATCGGAC 60 

ATCAAGTTGT TTTGATGTCA ATGGATGAAT ATGATACGAC AAACATCGCG CAGTTAGAAG 120 


ATTTATTTAT TATTACGTCT ACTCATGGTG AAGGAGAACC GCCTGATAAT GCATGGGATT 180 

TCTTTGAATT TTTAGAAGAC GATAACGCAC CTAATTTAAA TCATGTGAGA TATTCAGTAC 240 
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TACTAGAAAA TCTAGGCGCT GAGCGTATAT GTAAGCGTGT AGATTGTGAT ATTGATTATG 360 

AAGAAGACGC AGAAAAGTGG ATGGCAGACA TCATTAATAT TATTGATACC ACATCAGAAG 420 

GTATTCAAAG TGAATCGGTG ATAAGTGAAT CAATTAAGTC TGCCAAAGAA AAGAAATATT 480 

CTAAATCAAA TCCATACCAA GCAGAAGTAT TAGCGAATAT CAATTTAAAT GGTACCGATT 540

CAAATAAAGA AACACGACAT ATAGAATTTT TACTTGATGA TTTTAGTGAA TCATATGAAC 600 

CAGGAGATTG TATAGTAGCA TTACCGCAAA ACGACCCTGA ATTGGTTGAA AAACTAATAT 660 

CCATX3TTAGG TTGGGATCCG CAATCTCOGG TGCCAATTAA TGATCATGGT GATACAGTTC 720 

CTATTGTTGA AGCACTAACA TCACATTTTG AATTTACTAA ATTAACATTG CCATTATTGA 780 

AAAATGCAGA TATCTATTTT GACAATGAAG AATTATCTGA ACGTATTCAA GATGAGTCAT 840 

GGGCGCGTGA ATATGTTATA AATCGGGACT TTATAGATTT AATAACAGAT TTTCCAACTA 900 

TAGAATTACA ACCTGAGAAT ATGTATCAAA TCCTTAGAAA ATTACCACCA AGAGAGTATT 960

CX5ATTTCTAG TAGTTTTATG GCAACGCcAG ATGAAGTGCA TATTACCGTT GGTACGGTTC 1020 

GTTATCAAGC ACATGGACGT GAGAGAAAAG GTGTATGCTC GGTTCATTTT GCTGAGCGAA 1080 

TTAAACCAGG CGATATAGTA CCAATTTATT TGAAGAAAAA TCCGAACTTC AAATTTCCGA 1140 

TGAAGCAAGA TATACCGGTT ATTATGATTG GACCAGGTAC TGrAATTGCT CCTTTTAGAG 1200 

CATATTTACA AGAACGTGAA GAACTTGGTA TGACTGGAAA AACATGGTTG TTCTTTGGTG 1260 

ATCAACACCG TAGTTCTGAC TTTTTATATG AAGAAGAAAT AGAAGAATGG CTTGAAAATG 1320 

GAAACTTAAC ACGCGTAGAT TTAGCATTTT CAAGAGACCA AGAACACAAA GAATATGTAC 1380 

AGCATCGTAT AATGGAAGAA AGTAAACGTT TCAATGAATG GATTGAGCAA GGCGCACAAT 1440 

CTATATTTGT GGCGATGAAA AATGTATGGC GAAAGATGTC CATCAAGCCA TTAAAGATGT 1500 

ATTGSTAAAA GAACGTCATA TTTCTCAAGA AGAAGCAGAG TTATTATTGC GACAAATGAA 1560 

ACAACAACAA CX3CTATCAAC GTGATGTTTA TTAGOGATTG GTGTTAAATA TTTTAAGGTG 1620 

TAATGATGTA AAAAGATATA AAGGATGTTG CTCAACATGA ATATGCCATT AATGATAGAT 1680 

TTAACAAATA AAAATGTCGT CATAGTTGGT GGAGGCGTCG TTGCAAGTCG TCGGGCACAA 1740 

ACATTAAATC AATACGTTGA ACATATGACG GTCATCAGTC CGACAATCAC TGAAAAACTT 1800 

CAAAATATGG TAGATAACGG TGTCXJTCATA TGGAAAGAAA AAGAATTTGA ACCAAGCGAT 1860 

ATTGTAGACG CGTATCTAGT TATTGCAGCA ACCAATGAGC CACGTGTCAA TGAAGCGGTA 1920 

AAAAAAGCCT TACCTGAGCA TGCCCTTTTT AATAATGTTG GAGATGCATC AAATGGCAAT 1980 

GTTGTATTTC CAAGTGCACT ACACCGCGAC AAGCTAACTA TCAGTGTATC AACTGATGGT 2040 
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TACAGTTCGT ATATCGACTT TTTATATACT TGCCGACAGA AAATAAAAGT ACTTGATATA 2160 

ACATATAACG AAAAGCAACA GTTACTGTCA CAAATTGTGT CACAAGAATA TTTAAATCAT 2220 

GACAAACAAG CTCAATTTTT AGCGTGGTTG GATGTAAGAT AATAATAGCG GACCGTCTAA 2280 

CCGTCTAAGG TAAGTCTTCT TATTTTAACT TTAACGCTTA ATCATTGAAA TTAAGACATG 2340 

GGCGGCTTTG TGAATAGTCT AATAATGAAG GATTTAAGCG ATAATGATAT GCX5TTTTAAA 2400 

TATGAATATT ACAATAGAQA AAAAGATACG TAGAACAAAC TTAATAAAAT AGGTGGATAA 2460 

ATTGAAATCT GGTTGAAGTC GTTACTATCA TAGCX3ACCTT TAGCCAGATT TTTTGTGCAA 2520 

TAGAAAGCAA TAATAAAAAT GATAGATCAA AATGAAATAC AGGACAGGAT ATACAAGGAT 2580 

TAGTCATGCC ATGTTATCAA GTAGGAAAAT CAAACTTCAC TATTGATAGT TACGCAAAAA 2640 

AGATTTTTTT GATAAAATGA GATAACTTAA ATATAAAAAA TTATATTAAT TATAATATTT 2700 

AAGTTAAAGA GGGGGATTAT GTAAATTGTA TTAAAAGTGG AGGGAGAAAA TAATATGAAT 2760

AGTGATAATA TGTGGTTAAC AGTAATGGGG CTCATTATTA TTATTTCAAT TGTAGGTTTA 2820 

CTCATTGCCA AAAAGATAAA TCCAGTTGTA GGTATGACAA TCATACCTTG CTTAGGGGCA 2880 

ATGATTTTAG GATATAGTGT GACAGATTTG GTTGGATTTT TTGCTAAAGG GTTAGATCAA 2940 

GTCATCAACG TTGTTATTAT GTTTATCTTT GCCATTATTT TCTTTGGCAT CATQAACGAT 3000 

AGTGGTTTAT TCAAGCCGCT TGTCAAACGC TTAATATTAA TGACACGAGG CAATGTCGTC 3060 

ATTGTCTGTG CAATGACAGC TTTAATTGGC ACAATAGCCC AATTAGATGG GGCCGGTGCG 3120 

GTAACATTTT TGCTTTCTAT TCCTGCATTA TTACCTTTAT ATAAAGCGTT AAATATGAAT 3180

AAATATTTAT TGATTTTACT ATTAGCATTA AGCGCGGCX3A TTATCAACAT GGTACCTTGG 3240 

GGAGGTCCAA TGGCTCGTGT AGCTGCAGTG TTAAAAGCCA AAAGTGTCAA TGAATTATGG 3300 

TATCa?ATTAA TACCTATTCA AATAATAGGT TTCATTCTTG TTATGTTGTT TGCGGTATAT 3360 

CTrGSGATTTA AAGAACAGAA ACGTATCAAA AAAGCAATAG AGAGAAATGA ATTACCGCAA 3420

ACACAAGATA TAGATGTACA TAAATTAGTT GAAGTATATG AACGAGATCA AGATCTAAGG 3480

TTTCCTGTAA AAGGACGTGC AAGAACAAAA TCATGGATAA AATGGGTGAA TACAGCTTTA 3540

ACTTTAGCTG TTATTCTATC GATGTTAATA AATATTGCGC CACCTGAATT TGCATTCATG 3600

ATAGGTGTTy CGTTGGCACT TGTTATTAAT TTTAAATCAG TGGATGAACA AATGGAACGA 3660 

TTAAGAGCgC ATGCGCCGAA TGCATTAATG ATGGCTGCAG TGATTATTGC AGCAGGTATG 3720 

TTTTTAGGTG TACTAAATGA AACCGGTATG CTTAAAGCGA TTGCGACCAA TTTAATCAAA 3780 

GTGATTCCTG CAGAAGTAGG ACCATACTTG CATATTATTG TAGGTTTACT TGGCGTACCA 3840 
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ACAGCAGGGC AATTTGGTGT ACCGTCTGTA TCAACAGCTT ATTCAATGGT CATAGGGAAT 3950 

ATTATAGGTA CATTTGTCAG CCCATTTTCA CCAGCCTTAT GGTTGGCAAT TGGTTTAGCA 4020

GAGGCAAACA TGGGCACGTA TATTAAGTAT GCATTCTTTT GGATTTGGGG ATTCGCTATC 4080 

GTTATGTTAG TAATTGCAAT GTTGATGGGC ATTGTGACGA TTTAAGTATG AAAAAATAGA 4140 

AACTATGGTC ACGTTGCAAA ATGAAATAAT AGTTGCATAA ACATGTCGAA ATGACGGACG 4200 

AATCTTTAAA CAATTTTAAA AATTAATGAA ATAATTGTGT AQAAATATGA ATTTCACTAA 4260 

ATGTTAATAA CTTTGTGACG TTTTAGTTAA CAGACTAATA AAAATTTOAA AATACTATAT 4320 

ATAGTGGTAT AACGTAATGA GTAGACACAA TATATAGGAA QAAGGGGTAA AATGAATCAA 4380 

ATCGAAGAAG CATTAACGGG TTTGATTTCT AAAGATCCTG CTATTGTTAA CGAAAATGCT 4440 

AACAAAGATA GTGATACATT TTCAACAATG AGA6ATTTAA CAGCAGGTAT CGTTTCTAAA 4500 

TCTTACGCAT TAAATCATTT ATTACCAAAG CACGTTGCAG ATGCACATCA AAGAGGGGAC 4560

ATACATTTTC ACGACTTAQA TTATCATCCA TTCCAACCGT TAACTAACTG TTGTTTAATA 4620 

GATGCTAAAA ATATGCTACA TAATGGATTT GAAATAGGCA ACGCGAATGT AACTTCACCA 4680

AAATCAATAC AAACTGCATC AGCX5CAGCTT GTACAAATTA TAGCCAATGT TTCTAGCAGT 4740

CAATATGGTG GCTGTAcGGT TGACCgCGTT GACGAATTAC tTAGTACATA TGCACGACcA 4800 

TAATGAAGAA CAACATAGGA ATATsCGCAA AGCAATTTGT CAAAGAATCT GAAATTGATC 4860 

GTTATGTTGA TCAACAAGTC ACTAAAGACA TCAATGATGC GATTGAAAGT TTAGAATATG 4920 

AAATTAATAC CTTATATACA TCTAATGGAC AGACACCTTT TGTAACATTA GGATTCGGCT 4980 

TAGGTACAGA TCATTTAAGT CGCAAAATTC AACAAGCTAT CTTAAATACT CGTATCAAAG 5040 

GCTTAGGAAA AGACCGCACX5 ACAGCGATTT TCCCGAAACT TGTATTTTCA ATTAAAAAAG 5100 

GAAC<3UVCTT TAGTCCGCAA GATCCGAACT ATGACATTAA ACAACTAGCA TTAAAGTGTT 5160 

CAACGAAACG TATGTATCCA GATATTTTAA ATTATGACAA ACTCGTAGAA ATATTAGGTG 5220 

ATTTCAAAGC GCCAATGGGT TGTCGTTCAT TTTTACCAAG TTGGAAAGAT GCGGAAGGTC 5280 

ATTTTGAAAA TAATGGTCGT TGTAATCTTG GTGTTGTTAC ACTTAATTTA CCTAGAATGG 5340 

CATTAGAATC TGCCXSGTAAT ATGACGAAAT TCTGGGAAAT CTTTTATGAA CGTATCGATG 5400

TGTTACATGA TGCATTACTT TATCGTATAA ATCGTTTGAA AGATGCTGTA CCGAATAACG 5460 

CACCGATTTT ATATAAAAGT GGCGCATTTA ACTATAAATT AAAAGAAACA GATGATGTTG 5520 

CTGAGTTATT TAAAAATAAA CGTGCAACGA TTTCAATGGG CTATATAGGG TTGTATGAAA 5580 

CAGCTACTGT TTTCTATGGT CCAGACTGGG AAACATCTCA AGAAGCAAAA GCATTTACGC 5640 
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GGTTCAGTAT TTmCAGTACG CCGAGTGAAT CGCTAcGGAT CGTTTTTGTC GTTTAGACCA 5760

AGAGAGATTT GGAGATATTA AAGACATTAC AGATAAAGGA TATTATCAAA ACTCTTTCCA 5820

TTATGATGTA CGTAAAGATG TTACACCTTT TGAAAAGTTA GATTTTGAAA AAGATTATCC 5880

TTATTATGCG AGTGGTGGTT TCATTCACTA TTGTGAGTAT CCGAAATTGC AACACAATTT 5940

GAAAGCACTA GAAGCGGTAT GGGACTACTC TTATGACAAA GTTGGTTACT TAGGTACAAA 6000

TATTCCGATT GATCATTGTT ATGAATGTGA TTACGATGGA GATTTTGAAG CAACTGAAAA 6060

AGGATTTAAA TGCCCGAACT GTGGCAATGA TAATCCTAAA ACAGTTGATG TCGTTAAACG 6120

AACATGTGGT TACCTAGGCA ATCCAGTTCA ACGTCCAGTA ATTAAAGGCC GTCATAAAGA 6180

AATTTGCGCA CGAGTAAAAC ATATGAAA6C GCCTAAAGAA TGATACTTTT AGACATTAAA 6240

CAA6GACAAG GTTATATTGC TAAAATAGAA TCAAATAGCT TTGTTGACGG TGAAGGAGTA 6300

AGATGGAGTG TTTATGTATC AGGATGTCCA TTTAATTGTG TTGGATGTTA TAACAAAGCC 6360

TCACAAAAGT TCAGATATGG CGAGAAATAC ACTGATGAAA TATTAGCAGA AATATTAGAT 6420

GATTGCGATC ATGATTATAT ATCTGGGCTA AGTCTATTAG GTGGCGAACC ATTTTGTAAT 6480

TTGGATATTA CATTAAATCT TGTCAAAGCA TTTCGAGCAC GTTTTGGAAA TACAAAGACA 6540

ATTTGGGTAT GGACTGGATT TTTATATGAA TATTTAGCAA ATGATTGTAC AGAACGTCGA 6600

GAGTTATTAT CATACATTGA CGTTTTAGTA GATGGTCTAT TTATACAACA CTTATTCAAA 6660

CCTGATTTAC CATATAAAGG TTCTTTAAAT CAACGCATTA TAGATGTACA ACAATCACTC 6720

TCGCATGCGC GTATGATTGA ATATATAGTT AGTT6AATAT GTATTAGAAG TCAAGGTAAC 6780

ATTCGTTGCC TTGGCTTCTT TTTAGGTTAO GTACATAATT GAAAGTTAAT AAAAGCAATT 6840

CTTTATAAAA ATATATTGAT AGAATATGAC CTAACAATCA TTTTGATACC AATACTAAAA 6900

GTTGCATATC CGTTTTTTAA AAAA6TTGAA AGAGAAAAGT GGTATTTTAG TGGGAAGGAA 6960

GTCTAACTTT TTGGTAGCGT TTTACAATAA ATAAATATTC GTTAATAACG TATAAATATT 7020

CTTAAATGCC ATTCTAGTAA AATTTGTTAA ATTCGTTAAA TCGTAACTTA ACACTGTTAT 7080

TTTAGC6CTA TTAAGGTTTT GTTTATTACG GGAAAAATTA TATAAATATT CAATAATTGC 7140

CAAGTTTCAA ATTGTATGAA ATTTGCATTA TTATTAAATG TTAGTTATTG TCAA'i-ri'WX' 7200

GAATCAATAT AATTATTACA TTTTGAGATA AATCGAAACA GGATTCATAA AATTAATAAT 7260

TAGGGGGAGC ACAATTGAAA AAAGAGAAAG TTATGGACTG GACGACCTTT ATAGGGACAG 7320

TAGCTGTACT TCTTTTTGCA GTTATACCTA TGATGGCTTT TCCAAAAGCA AGTGAAGATA 7380

TCATCACTGG TATTAATAGT GCCATTTCTG ATTCAATTGG TTCGATATAT TTATTTATGG 7440
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TTGGTAAAGC AAGTGATAAA CCAGAATTTA ATACATTTAC ATGGGCGGCA ATGCTGTTTT 7560 

GTGCAGGCAT AGGCTCTGAT ATTTTATACT GGGGCGTTAT TGAATGGGCT TTTTACTATC 7620 

AAGTTCCACC AAATGGCGCG AAAAGTATGA GTGATGAAGC ACTCCAATAT GCGACGCAAT 7680 

ATGGTATGTT CCACTGGGGG CCAATTGCTT GGGCTATTTA TGTTCTACCA GCATTACCAA 7740 

TTGGTTATTT AGTATTTGTT AAAAAACAAC CGGTGTATAA AATTAGTCAA GCTTGTCGTC 7800 

CGATTTTAAA AGGTCAAACA GATAAATTTG TAGGTAAAGT TGTAGATATC TTATTTATCT 7860 

TTGGATTGCT AGGTGGTGCG GCAACATCAC TAGCGTTAGG TGTGCCATTA ATTTCTGCAG 7920 

GCATAGAAAG ATTAACTGGT TTAGATGGTA AAAATATGAT TTTACXfTTCG GCCATTTTAT 7980

TAACAATCAC GGTTATATTT GCCATTAGTT CATATACAGG ATTGAAAAAA GGTATTCAAA 8040 

AGTTAAGTGA TATCAACGTT TGGCTATCCT TTGTACTTTT AGCCTTTATA TTTATTATTG 8100 

GACCGACTGT TTTTATTATG GAAACGACAG TGACAGGGTT CGGAAATATG TTGAGAGATT 8160

TCTTTCATAT GGCAACATGG TTAGAACCAT TCGGTGGTAT TAAAGGTCGA AAAGAAACGA 8220 

ATTTCCCACA AGACTGGACA ATATTCTACT GGTCATGGTG GTTAGTATAT GCGCCATTTA 8280 

TCGGTTTATT TATCGCTAGA ATTTCAAAAG GTCGACGCCT TAAAGAAGTC GTGCTAGGAA 8340 

CAATTATTTA TGGAACGCTT GGATGCGTAT TATTCTTTGG TATTTTTGGT AACTATGCTG 8400

TGTATTTACA AATTTCTGGA CAGTTTAATG TAACACAATA TTTAAATACA CATGGTACAG 8460 

AGGCAACCAT TATTGAAGTG GTGCATCATT TACCATTCCC ATCATTGATG ATTGTACTAT 8520 

TCTTAGTATC TGCTTTCTTA TTCTTAGCAA CAACATTTGA TTCGGGTTCA TATATTTTAG 8580 

CGGCAGCATC TCAGATU^AAA GTGGTAGGCG AACCATTACG TGCCAATCGT TTATTCTGGG 8640 

CATTTGCATT GTGCTTATTG CCATTTTCAT TGATGCTAGT TGGTGGTGAA CGTGCATTAG 8700 

AAGTATTGAA AACTGCTTCA ATACTGGCAA GTGTGCCATT AATTGTTATT TTTATTTTCA 8760 

TGATGATATC ATTTTTAATC ATTTTAGGGC GCGATAGAAT TAAACTTGAA ACGCGTGCTG 8820

AAAAATTAAA AGAAGITGAA CGTCGTTCAT TGCGAATCGT TCAAGTATCa GAAGAAGAAC 8880 

AAGACGATAA TTTATAATTC AAAGCGGGTC TGGGACXSACG AAATGaATTT TGTGAAAATA 8940 

TCATTTCTGT TCCaTTCCCC TTTTTTTAGT AGCATTGTAG GATGAACTTT TAGGTTTTCA 9000

TTAATGTTGT ACTAAAAGAT TTAATTTTTT AGTGCTCCAA GTACTTATTT ATTGTATGAA 9060 

GCATATTCTA AATCGAAGTT TGAAAGACTC TCATTGATTA TTAAATTAAA TAAAGGGTAT 9120 

GCGTATGTAC AA1TCAAATT AATCGAAGGA TGAAATAAAA TGACTAATCA ATTTAAAAAT 9180 

AAACAGTCCA AATTACATGA CAGTTTAGAA TCCATCACAA AAAACTTATA TGCGACACCT 9240 
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20 



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ACAGAATATT GTTATCTATC ATTCCGGACA CTTAGGTGAC TCCCAACAAG ACATTGCATC 9360 

ATTAGGTGGT GTTTCAAAAG TATTGATGAA TCATGATCAT GAATCTATAG GAGGTTCTAA 9420 

TCAAGTTGAA GCCCCTTACT TTATACATGA AAATGATX3TG GCTGCACTGA AACATAAGAT 9480 

TTCTGTTCAA AAACAATTTA GTAATCGTGT AATGTTGGAT AAGGATTTAG AAGTTATTCC 9540 

CGCGCCTGGA CATACACCAG GGACGACACT AITTTTATGG GATGATGGTC ATCACCGTTA 9600 

CTTATTTACT GGAGATTTTA TATGTTTTGA AGGGAAGAGA TGGCGTACAG TTATATTAGG 9660 

TTCAAGTGAT AGAGAAAAAT CTATTCAAAG TTTAGAGATG GTTAAAGAAT TAGATTTTGA 9720 

TGTACTTGTA CCTTGGGTTA CTATCAAAGA TGAACCGTTA GTTTATTTTG TAGAAAATGA 9780 

ATATGAAAAA CX3TGAACAAA TACAAAATAT TATTGATAGA GTACGTGAGG GCGAGAATAG 9840 

CTAATTGAAA TATATTGGCG AAgCAATGTA ACGAATCTAA GAAAGCCCTA GAAAATACCT 9900 

CCATAATTGA TTGTCATATA AAACAAAAAC GGTAATTTCT ATTTATTGAG ATAGAAATTA 9960 

CCGTTTATTT CGTGGACCTA TTGCATTGTT TTTATCATGC ATAATCATCA TTGTCGTTGT 10020 

TTGAGTCAAT TTTAATTTTC AGAATCAGAA GGCTX3TTCTG GAATTGGGAA ATATTTGAAA 10080 

ATTTCACCGC TITCAATCGC TTCGGTTAAC TGTTCTAACC ATTCX3TAATA AACATGTGTA 10140 

TGATCAAGCT GAGCTTTAAT TTTTTGTGCC TCTTGTGTTT CAGCTTCAGT TAAATCACTG 10200 

CTTTCAAGTA ATGGATTGAT AATAGCTTGA GCATCTTTTA CTGCTTCGAC ATTGATGTCA 10260 

ATTTCACGCT GGAATTTTTT AGTGAAAAAG TTTCX3GAAAA AGATGAAAAA GTCTTTCTCG 10320 

GCGATAAAAT GTTGTTTGCG GCTTCCTCTC GTAAATTGTT GTTTAACAAT ATCAAATTCC 10380 

TGCAATTTCT TAACGCCAGC ACTCATACTT GGTTTGCTCA TTTGCAATTG ATGACGCATT 10440 

TCATCAAGCG TCATACTGCC TTCAAACACC ATTGTGCCAT ATAAGTTTCC TACACTTCTA 10500 

TTACTGCCAT ACAAATCCAT TGTCTGTCCA ATTGAATTAA TTACAATATC TTTTGCTTGT 10560 

TCTAATTGTT GCTGTTTGTT CTGAGAACGA GTCATCATTG CACCTCCGTA CATCATTTTG 10620 

GTCACGTTAA AATAAATACT AATACATTAT AAAACCTTTT CTAAAAAAAG ACATTAAAAA 10680 

TATTTAAAGC ATTAAAGTTA AATGTTTCGT TAAATAAAAA TCTAACGAAC TTACAAAACT 10740 

TAATTCTTGA GTTGTTTTGT AAATTGACAC ATTTTTCATT TCTATGCTAA CATAAGTnTG 10800 

TAAAATTcGT TAAATAAAAA TTTAACAAAC TTAACGGrGG TTGTTGAAkG GrACTTTTAA 10860 

aACATTTATC TCAGCGTCAA TATATTGATG GTGAGTGGGT TGAAAGCGCG AATAAAAATA 10920 

CAAGAGATAT TATCAATCCT TACAATCAAG AAGTGATATT TACX3GTTTCT GAAGGGACAA 10980 

AAGAGGATGC AGAACGTGCA ATCTTAGCTG CAAGACGTGC GTTTGAGTCT GGTGAATGGT 11040 
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AACATCgCGA AgCgTTAGCA CGATTAGAAA CATTAGATAC TGGAAAAACG TTAGAAGAAT 11160 

CATATGCAGA TATGGATGAT ATTCATAATG TGTTTATGTA TTTTGCTGGA TTAGCAGATA 11220 

AAGACGGTGG CGAAATGATT GATTCACCAA TTCCAGATAC AGAAAGCAAA ATTGTTAAAG 11280 

AACCAGTAGG TGTAGTTACA CAAATTACAC CTTX3GAATTA TCCGTTATTA CAAGCATCAT 11340 

GGAAAATTGC GCCAGCGCTT GCTACGGGTT GTTCACTAGT TATGAAACCA AGTGAAATTA 11400 

CACCATTAAC AACAATACGT GTTTTTGAAT TAATGGAAGA AGTTGGTTTC CCTAAAGGAA 11460 

CAATTAATCT TATTCTAGGT GCAGGTTCTG AAGTTGGTGA CGTAAT6TCA GGTCATAAAG 11520 

AGGTTGACCT TGTATCATTT ACAGGTGGCA TTGAGACTGG TAAGCATATT ATGAAAAATG 11580

CTGCTAATAA TGTTACGAAT ATTGCCTTGG AACTTCGCX^G TAAAAATCCA AACATTATCT 11640 

TTGATGATGC TGATTTTGAA TTGGCAGTAG ACCAAGCX5TT AAATGGTGGA TATTTCCATG 11700 

CAGGTCAAGT TTGTTCAGCA GGATCAAGAA TATTAGTACA AAACAGTATT AAAGACAAAT 11760

TTGAGCAAGC ACTTATTGAT CGCGTGAAAA AAATCAAATT AGGTAATGGT TTTGATGCTG 11820 

ATACTGAAAT GGGACCAGTG ATTTCAACAG AACATCGTAA TAAGATCGAA TCTTATATGG 11880 

ATGTAGcTAA AGCAGAAGGC GCAACAATTG CTGTTGGTGG TAAACGTCCA GATAGAGATG 11940

ATTTAAAAGA TGGTCTATTC TTCGAGCCAA CAGTCATTAC AAATTGTGAT ACGTCAATGC 12000 

GTATTGTACA AGAAGAGGTT TTCGGACCTG TCGTTACTGT AGAAGGCTTT GAAACTGAAC 12060 

AAGAAGCGAT TCAATTAGCG AATGATTCTA TATATGGTTT AGCAGGTGCT GTATTTTCTA 12120 

AAGATATTGG AAAAGCACAA CGCGTTGCTA ACAAGTTGAA ACTTGGAACG GTGTGGATTA 12180 

ATGATTTCCA TCCATATTTT GCACAAGCGC CATGGGGTGG ATACAAACAA TCAGGTATCG 12240 

GTAGAGAATT AGGCAAAGAA GGCTTAGAAG AGTACCTTGT TTCAAAACAC ATTTTAACAA 12300 

ATAaAATCC ACAATTAGTG AATTGGTTTA GCAAATAAAA ATTAQATAAG GTGAGTGCCA 12360 

TTGTAAGAAC ACAAGACACT CACTTTGTTT TGTATAAGTG GCGAAATGTT GATTGATAAT 12420

TTGGACTAAA OGCAAAATGA ATCATAGATT ATTTCATTAC TGTTAGTAAC AATCGTAAAA 12480 

GGAAAAGCGA GTGTTTTGGT TAGCTAAGTT TAGCAATTCA ACGATAACCA ATCAGCCACT 12540 

AACAAATATT TCATGCAATA CTCACTTTGA AATACAACAA ACTTTGGAGG TCATAACGAT 12600

GAGTAACAAA AACAAATCAT ATGATTATGT CATCATTGGA GGAGGCAGTG CAGGTTCTGT 12660 

ACTAGGTAAT CGTCTGAGTG AAGATAAAGA TAAAGAAGTC TTAGTATTAG AAGCGGGTCG 12720 

CAGTGATTAT TTTTGGGATT TATTTATCCA AATGCCTGCT GCGTTAATGT TCCCTTCAGG 12780 

CAATAAATTT TACGATTGGA TTTATTCAAC AGATGAAGAA CCACATATGG GCGGTCGTAA 12840 
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TCAACGTGGT AATCCAATGG ACTATGAAGG CTGGGCAGAA CCAGAAGGTA TGGAAACTTG 12960

GGATTTTGCG CACTGTTTAC CGTATTTTAA AAAATTAGAA AAAACATACG GTGCAGCGCC 13020

TTATGATAAA TTTAGAGGCC ATGATGGACC AATTAAGTTA AAACGAGGGC CAGCAACGAA 13080

TCCTTTATTC CAGTCATTCT TTGATGCAGG TGTTGAAGCA GGCTATCATA AAACACCTGA 13140

TGTGAATGGA TTTAGACAAG AAGGTTTTGG ACCGTTCGAT AGTCAAGTAC ATCGTGGTCG 13200

CCGAATGTCA GCTTCAAGAG CATATTTACA TCCA6CGATG AAGCXTTAAAA ACTTAACCGT 13260

TGAAACACGT GCCTTTGTAA CTGAAATTCA TTATGAAGGT AGAAGAGCAA CTGGTGTTAC 13320

GTATAAGAAA AATGGCAAAC TACATACCAT CGATGCTAAT GAAGTCATTT TGTCTGGTGG 13380

GGCATTCAAT ACGCCACAAT TACTACAATT ATCTGGTATC GGTGATTCA6 A6TTCCTAAA 13440

ATCAAAAGGC ATTGAGCCAC GTGTTCATTT ACCTGGTGTG GGTGAAAACT TTGAAGATCA 13500

CTTAGAGG 13508

(2) INFORMATION FOR SEQ ID NO: 121:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7646 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 

GTAAGTATTG TCTTGATTTC CTAATAAAGT TATATCTTGT AATTCATCTT GTTQACGGCC 60 

ATGTGCCATA TAAAGCGCTC CTTTAAATTT ATTTTTTTAT TATTTTGGCG TCTCGGCGTG 120

CTTTTTCAAA CATGTAATAA CTTGCACCGA TAATAACGAC GTAACCTAAT GTTGCATAGA 180 

AATCa^GGAGA TTCTCCGAAT AGAATAAATC CAAGTATTGC TGTGAAAATT ATAGATGCAT 240 

ACGTAAAAAT AGAAATATCT TTTGCTGCTG CAAAACTATA TGCTAAAGTA ACACCAATTT 300 

GACCCACAGC GGCAgCTAAG CCAGCCCCTA ATAGATAAAG TATTTGCATC TGACTCATTG 360 

GTTCATAAGT ATATGCAGTG AAAGGTATTA AAACGATGAC AGAAAATAAG GAGAAGTAAA 420 

ATACTATAGT ATATGGTGCT TyTCTTGTAC TAAGTGCTCG AACACATGTA TATGCTGATG 480

CTGCAAAAAT ACCTGAGAAT AAGCCAGCTA ATGATGGAAT CATAGATGAT GAAAATTCAG 540

GTTTCACTAT TAAnAGCAaC CTAAAATAGC AATTATCATT GCTGTAATTT GaTACTTCCT 600 

TACCTTTTCA TGtAAGAaaA CAATGCTTaA TAAAATCGTC CAGAAAGGAT TGAGTTTCAT 660 

TAATGAATCG GCATCACTAA GTACCATATG ATCAATGGCA TAAATATTTA ACAATACACC 720 
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TGGCTGATGG TATTTATATA TAAAAAATAA TGGAATAAAC ATTGCTACTA AGTTTCGTGC 840 

TAATGATTTT TGAAAAACAG GAAGGTCACC TGCAAGTCTG AAAAACACTG ACATAAAACT 900 

GAAACCAATA GCCGAAATTA AAATGGCAAT GATACCTTTT ACTTTAGGAT TCAATTTTAT 960 

CGCCTCTTTT ATATAAAATT AACGTATTTA TATTAGCATA AAACAACATG TTGTGCATAA 1020 

ATAGTTGAAA TTTACTATAA AAAGACTATA ATAGACTGTA GCGAACAAAC GTTCTGTGTT 1080 

TATTTGTCGG AATAATAGGG CATTACACTT TTATGAATGT TTGTGTTATT ACATAAAACA 1140 

AATATCAATT CAGTATCAAG CTAATAAGCT TTTTCTTGAT TTCTGTTGAT ACAATTGAGA 1200 

TTGACACAGA TTTAAAAAAA TCAAGTGATA TCTACTAAAA AATTTTTTTA AATTTGTTCA 1260

AGTTTTTCTA ATTTAGTATT GGTGCCTAGT TGGAACGTTT TACGAACATT CGATTAGAAA 1320 

ATGGCACTTT AAATCATAGT GTGTCTTATG TATAATGAAA CACATAATAT AGTGTTGGTG 1380 

AAACGAAAAA gACACAATAT CTTGTGTTTT GTATGCAAAT GCTTTATTTA TGAAGAAATT 1440 

ACATTTAAAA 6TAATTTAAC ACAGAAATTT AATAGTTATT ATCAATTAAT AGTCATATTT 1500 

TTAGAAAATG TACTGAGCAA ATGGAAGATA TCCAATGATG TAAACACTAC ATATAGTGAT 1560 

TTTTATACAT TCAACCCATA TAAGCTACTA TTTTCTCAAA TATAAATCTA TGCAATTGGT 1620 

TTACATTTGA GAAAATAAGT AGCTTCATTA TAGTTAATAC AATGCTGAGA TAACCATAGT 1680 

AACCATGTTG TTAAAGCATT TTTTAATTGG AATQACTACT TTATTTAAAA GGGTTGAAGA 1740 

AAGAAGGTGA TCCAATGAAA ATAATATATT TTTCATTTAC TGGAAATGTC CGTCGTTTTA 1800 

TTAAGAGAAC AGAACTTGAA AATACGCTTG AGATTACAGC AGAAAATTGT ATGGAACCAG 1860 

TTCATGAACC GTTTATTATC GTTACTGGCA CTATTGGATT TGGAGAAGTA CCAGAACCCG 1920 

TTCAATCTTT TTTAGAAGTT AATCATCAAT ACATCAGAGG TGTGGCAGCT AGCGGTAATC 1980 

GAAMTGGGG ACTAAATTTC GCAAAAGCGG GTCGCACGAT ATCAGAAGAG TATAATGTCC 2040 

CTTTATTAAT GAAGTTTGAG TTACATGGAA AAAACAAAGA CGrTTATTGAA TTTAAGAACA 2100

AGGTGGQTAA TTTTAATGAA AACCATGGAA GAGAAAAAGT ACAATCATAT TGAATTAAAT 2160 

AATGAGGTCA CTAAACGAaG AGAAGATGGA TTCTTTAGTT TAGAAAAAGA CCAAGAAGCT 2220 

TTAGTAGCTT ATTTAGAAGA AGTAAAAGAC AAAACAATCT TCTTCGACAC TGAAATCGAG 2280 

CGTTTACGTT ATTTAGTAGA CAACGATTTT TATTTCAATG TGTTTGATAT TTATAGTGAA 2340 

GCGGATCTAA TTGAAATCAC TGATTATGCA AAATCAATCC CGTTTAATTT TGCAAGTTAT 2400 

ATGTCAGCTA GTAAATTTTT CAAAGATTAC GCTTTGAAAA CAAATGATAA AAGTCAATAC 2460 

TTAGAAGACT ATAATCAACA CGTTGCCATT GTTGCTTTAT ACCTAGCAAA TGGTAATAAA 2520 
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ACATTTTTAA ACGCAGGCCG TGCGCGTCGT GGTGAGCTAG TGTCATGTTT CTTATTAGAA 2640 

GTGGATGACA GCTTAAATTC AATTAACTTT ATTGATTCAA CTGCAAAACA ATTAAGTAAA 2700 

ATTGGGGGCG GCGTTGCAAT TAACTTATCT AAATTGCGTG CACGTGGTGA AGCAATTAAA 2760 

GGAATTAAAG GCGTAgCGAA AGGCGTTTTA CCTATTGCTA AGTCACTTGA AGGTGGCTTT 2820 

AGCTATGCAG ATCAACTTGG TCAACGCCCT GGTGCTGGTG CTGTGTACTT AAATATCTTC 2880 

CATTATGATG TAGAAGAATT TTTAGATACT AAAAAAGTAA ATGCGGATGA A6ATTTACGT 2940 

TTATCTACAA TATCAACTGG TTTAATTGTT CCATCTAAAT TCTTCGATTT AOCTAAAGAA 3000 

GGTAAGGACT TTTATATGTT TGCACCTCAT ACAGTTAAAG AAGAATATGG TGTGACATTA 3060

GACGATATCG ATTTAGAAAA ATATTATGAT GACATGGTTG CAAACCCAAA TGTTGAGAAA 3120 

AAGAAAAAGA ATGCGCGTGA AATGTTGAAT TTAATTGCGC AAACACAATT ACAATCAGGT 3180 

TATCCATATT TAATGTTTAA AGATAATGCT AACAGAGTGC ATCCGAATTC AAACATTGGA 3240

CAAATTAAAA TGAGTAACTT ATGTACGGAA ATTTTCCAAC TACAAGAAAC TTCAATTATT 3300 

AATGACTATG GTATTGAAGA CGAAATTAAA CGTGATATTT CTTGTAACTT GGGCTCATTA 3360 

AATATTGTTA ATGTAATGGA AAGCGGAAAA TTCAGAGATT CAGTTCACTC TGGTATGGAC 3420 

GCATTAACTG TTGTGAGTGA TGTAGCAAAT ATTCAAAATG CACCAGGAGT TAGAAAAGCT 3480 

AACAGTGAAT TACATTCAGT TGGTCTTGGT GTGATGAATT TACACGGTTA CCTAGCAAAA 3540 

AATAAAATTG GTTATGAGTC AGAAGAAGCA AAAGATTTTG CAAATATCTT CTTTAT6ATG 3600 

ATGAATTTCT ACTCAATCGA ACGTTCAATG GAAATCGCTA AAGAGCGTGG TATCAAATAT 3660 

CAAGACTTTG AAAAGTCTGA TTATGCTAAT GGCAAATATT TCGAGTTCTA TACAACTCAA 3720 

GAATTTGAAC CTCAATTCGA AAAAGTACGT GAATTATTCG ATGGTATGGC TATTCCTACT 3780 

TCTGAGGATT GGAAGAAACT ACAACAAGAT GTTGAACAAT ATGGTTTATA TCATGCATAT 3840 

AGAtTAGCAA TTGCTCCAAC ACAAAGTATT TCTTATGTTC AAAATGCAAC AAGTTCTGTA 3900

ATGCCAATCG TTGACCAAAT TGAACGTCGT ACTTATGGTA ATGCGGAAAC ATTTTACCCT 3960 

ATGCCATTCT TATCACCACA AACAATGTGO TACTACAAAT CAGCATTCAA TACTGATCAG 4020 

ATGAAATTAA TCGATTTAAT TGCGACAATT CAAACGCATA TTGACCAAGG TATCTCAACG 4080

ATCCTTTATG TTAATTCTGA AATTTCTACA CGTGAGTTAG CAAGATTATA TGTATATGCG 4140

CACTATAAAG GATTAAAATC ACTTTACTAT ACTAGAAATA AATTATTAAG TGTAGAAGAA 4200 

TGTACAAGTT GTTCTATCTA ACAATTAAAT GTTGAAAATG ACAAACAGCT AATCATCTGG 4260 

TCTGAATTAG CAGATGATTA GACTGCTATG TCTGTATTTG TCAATTATTG AGTAACATTA 4320 
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ATGTTTTGGA GACAAAATAT ATCTCAAATG TGGGTTGAAA CAGAATTTAA AGTATCAAAA 444 0 

GACATTGCAA GTTGGAAGAC TTTATCTGAA GCTGAACAAG ACACATTTAA AAAAGCATTA 4500 

GCTGGTTTAA CAGGCTTAGA TACACATCAA GCAGATGATG GCATGCCTTT AGTTATGCTA 4560 

CATACGACTG ACTTAAGGAA AAAAGCAGTT TATTCATTTA TGGCGATGAT GGAGCAAATA 4620 

CACGCGAAAA GCTATTCACA TATTTTCACA ACACTATTAC CATCTAGTGA AaCAAACTAC 4680 

CTATTAGATG AATGGGTTTT AGAGGAACCC CATTTAAAAT ATAAATCTGA TAAAATTGTT 4740 

GCTAATTATC ACAAACTTTG GGGTAAAGAA GCTTCGATAT ACGACCAATA TATGGCCAGA 4800 

6TTACGAGTG TATTTTTAGA AACATTCITA TTCTTCTCAG GTTTCTATTA TCCACTATAT 4860 

CTTGCTGGTC AAGGGAAAAT GACGACATCA GGTGAAATCA TTCX5TAAAAT TCTTTTAGAT 4920 

GAATCTATTC ATGGTGTATT TACCGGTTTA GATGCACAGC ATTTACGAAA TGAACTATCT 4980 

GAAAGTGAGA AACAAAAAGC AGATCAAGAA ATGTATAAAT TGCTAAATGA CTTGTATTTA 5040

AATGAAGAGT CATACACAAA AATGTTATAC GATGATCTTG GAATCACTGA AGATGTGCTA 5100 

AACTATGTTA AATATAATGG AAACAAAGCA CTTTCAAACT TAGGCTTTGa ACCTTATTTT 5160 

GAGGAACGTG AATTTAACCC AATCATTGAG AATGCCTTAG ATACAACAAC TAAAAACCAT 5220 

GACTTCTTCT CAGTAAAAGG TGATGGTTAT GTATTAGCAT TAAACGTAGA AGCATTACAA 5280 

GATGATGACT TTGTATTTGA CAACAAATAA CAATTAAATT AAAAGACCTT CACATGTAAA 5340 

GGGAAATAGC GATTCGTTTC GTCTTGTCTC CTACATGTTG AAGGTCTTTT TTTATGTGTA 5400 

TCTAACTCAT TATGAGTCTG AGTAAGAAAT CAATGCTCTA AGATGTACAA TGCTATTTAT 5460 

ATTGGCAGTA GTTGGCGGGG CCCCAACACA GAAGCAGGCG GAAAGTCAGC TAACAATATT 5520 

GTGCAAGTTG GCGGGGCCCC AACATAGAAG CAGGCGGAAA GTCAGCTAAC AATAATGTGC 5580 

AAGTTGGCGG GGCCCCAACA TAAAAGCAGG CGGAAAGTCA GCTAACAATA TTGTGCAAGT 5640 

TCGGgCGGGG CCCCAACATA AA6AAAAACT TTTTCCTTTA GAAATTATCA CTTCCaCaTG 5700 

AGTTTTACTC ATGTATTCCT ATTTTTAAGT ACACATTAGC TGAGGCTAAT GTTAAGAACC 5760 

ACTACTTAAT CAATCATTAG TAGTTTTTAT CATTTCCACT ATTCCCaGAC ATCaAAATCT 5820 

TAAGTGTTCT ATTTTACTTT AAGTAAACAA AATACACATT CCX^^AAAATT AAATTTCAGT 5880

TTAATTGCAA ATATCAATAA AATTGACACT AAATTATTTG AAAGGCTATT GAAATTATGG 5940

TCAAAAAACG CTACTATTAA TGAGAAATAT TATCAATGAT AATGATTATC ATTAATTTAA 6000 

AGGGAGAAAA ATTTGTAATG AAGTATTTAT TAAAGGGAAA TATTTTGCTT CTATTACTAA 6060 

TATTGTTGAC AATTATTTCG TTGTTCATAG GTGTGAGTGA ACTATCAATT AAAGATTTAC 6120 
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GTATTTTAAT TGCTGGAAGT TCGTTGGCTT TAGCAGGCTT GATAATGCAA CAAATGATGC 6240 

AAAATAAGTT TGTTAGTCCG ACTACAGCTG GAACGATGGA ATGGGCTAAA CTAGGTATTT 6300 

TAATTGCTTT ATTGTTCTTT CCAACCGGTC ATATTTTATT AAAACTAGTA TTTGCTGTTA 6360 

TTTGCAGTAT TTGCGGTACG TTTTTATTTG TTAAAATCAT TGATnTATA AAAGTGAAAG 6420 

ATGTCATTTT TGTACCX5CTT TTAGGAATTA TGATGGGTGG GATTGTTGCA AGTTcACAAC 6480 

CTTCATCTCA TTGCGCACGA ATGCTGTTCA AAGCATTGGT AACTGGCTTA ACX;GGAACTT 6540

TGCCATTATC ACAAGTGGAC GCTATOAAAT TTTATATTTA AGTATTCCTC TTTTAGCATT 6600 

GACATATCTT TTTGCTAATC ATTTCAC3GAT TGTAGGAATG GGTAAAGACT TTACTAATAA 6660

TTTAGGTTTG AGTTACGAAA AATTAATTAA CATCGCATTG TTTATTACPG CAACTATTAC 6720 

AGCATTGGTA GTGGTGACTG TTGGAACATT ACXX3TTCTTA GGACTAGTAA TACCAAATAT 6780 

TATTTCAATT TATCGAGGTG ATCATTTGAA AAATGCTATC CCTCATACX3A TGATGTTAGG 6840

TGCCATCTTT GTATTATTTT CTGATATAGT TGGCAGAATT GTTGTTTATC CATATGAAAT 6900 

AAATATTGGT TTAACAATAG GTGTATTTGG AACAATCATT TTCCTTATCT TGCTTATGAA 6960 

AGGTAGGAAA AATTATGCGC aACAATAATA AAAAAATAAT GCTTTTAATT GCAGTAACGT 7020 

TATTAATTAG TATGCTGTAC TTATTTGTAG GTATTGATTT TGAAATATTT GAATATCAAT 7080 

TTTCAAGTCG TTTAAGAAAG TTCATATTAA TTATTTTAGT AGGTGCTGCC ATTGCAACTT 7140 

CAGTGGTGAT TTTTCAAGCG ATTACAAATA ACCGTCTATT GACACCATCA ATAATGGGGT 7200 

TAGATGCAGT TTATrTATTT ATCAAAGTAT TGCCAGTCTT TTTATTTGGA ATTCAATCGG 7260 

TATGGGTTAC TAATGTATAT TTGAACTTTA TATTAACACT TATAACGATG GTGTTATTCG 7320 

CACTAATCCT ATTCCAAGGT ATCTTTAAAA TCGGACATTT TTCAATTTAT TTTATCTTAC 7380 

TTAJTGGTGT CCTTTTAGGA ACATTTTTTA GAAGCATAAC AGGTTTTATT CAACTGATTA 7440 

TGGATCCTGA GTCATTTTTA GCAATACAAA GTAGTATGTT TGCTAATTTT AATGCTTCTA 7500

ATTCGAATTT AGTTACTTTC TCAGCAGTGC TATTAGTAAT CTTATTAGTC ATTACAATTT 7560 

TACTATTGCC TTATTTAGAT GTATTGCTTT TAGGTCGTGC TGAAGCAATT AATCTTGGGA 7620 

TATCGTATGA AAAATTAACG CGAATT 7646

(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1194 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 

ATGAATATAT TTnnAAATAA ATTATTATGG ATTGCACCAA TnGCCACTAT GATTATCTTG 60 

GTAATCTTTT CTTTAGCTTT TTATCCTGCA TATAATCCTA AACCAAAAGA TTTACCAATT 120 

GGTATATTAA ACGAGGATAA AGGTACAACG ATTCAAGATA AAAATGTTAA CATTGGTAAA 180

AAATTAGAGG ATAAATTATT AGATAGTGAT TCTAATAAAA TTAAATGGGT TAAGGTTGAT 240 

AGTGAAAAAG ACCTTGAAAA AGATTTGAAA GATCAAAAAA TCTTTGGAGT AGCTATTATT 300 

GATAAAGACT TTTCAAAAGA TGCTATGAGT AAAACACAAA AAGTAGTTAT GGATAGTAAA 360 

AAAGAAGAAA TGCAACAAAA AGTTGCTTCA GGTGAAATTC CGCCACAAGT GGTTCAACAA 420 

ATGAAACAAA AAATGGGGAA TCAACAAGTA GAGGTTAAGC AGGCTAAATT TAAAACGATT 480 

GTAA6TGAAG GATCAAGCTT ACAAGGTTCA CAAATTGCAT CAGCTGTGTT AACTGGTATG 540 

GGTGATAATA TTAATGCTCA AATTACGAAG CAAAGTTTGG AAACATTAAC GAGTCAAAAT 600

6TTAAAGTCA ATGCCGCGGA CATCAATGGT TTGACGAATC CAGTAAAAGT GGATAATGAA 660 

AAACTTAATA AAGTTAAAGA TCACCAAGCA GGTGGTAATG CACCATTCCT AATGTTTATG 720 

CCAATTTGGA TAGGTTCAAT CGTAACGTCT ATCTTATTGT TCTTTGCATT TAGAACTAGT 780 

AACAATATCG TCGTGCAACA TCGTATCaTT GCtTCAATTG GACAGATGAT ATTTGCAGTT 840

GTTGCAGCAT TTGCAGGTAG CTTTGTTTAT ATTTATTTCA TGCAAGGCGT TCAAAGATTT 900

GATTTTGACC ATCCAAATCG TATCGCAATT TTTGTAGCAT TTGCGATTCT TGGTTTCGTG 960 

GGCCTTATTT TAGGTGTTAT GGTATGGCTA GGTATGAAGT CAGTTCCAAT TTTCTTCATT 1020 

TTAATGTTCT TTAGTATGCA ACTTGTAACG TTACCTAAAC AAATGTTGCC TGAAAGTTAT 1080 

CAAAAATATG TATATGATTG GAATCCATTC ACACACTATG CAACAAGTGT AAGAGAcTAT 1140 

TATACSTTGAA TCATCATATT GAATTAAATA GTACAATGTG GATGTTTATA GGGT 1194 
(2) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 558 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 

GACCGACCTA TACATCCGTA TAAGTATTTC TTGATATAAG TCTTCTAAAT CATAATGATT 60 
AAATCCAAAT GTTTTGATGC GTCGAATAAT TAATGGTTGT AGATCCATTA CTAACTTTTC 120 
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GTATTTGAAA TATTAAACTA ACCCCTTCTA TCTAAAATTT AAGGTTAGTT TAATATTGTT 240 
ACATTCAAAA TTTCAAGATG ACGGAAATGT CATTTCTTAT GATGTCCTCT TCGTATTTTT 300 
TCAAATTCTG CAAGGATTTC AGAAGATAAC GGAATTCX5AG TTCTTGGCTT GTTTTCACTT 360

ATATCATCTA ATGATTTACT CACATCAATT TCATTTTCTT TTAAATCTCT CCACATTTCG 420 
CGAGATGATA TTCTATATGC ACCTGATCCA AAGATAGCAT GTTGcTCACT CaTATCACTT 480

GTTACAACTG TAATATGcTT AGtATGCTTG tCaTAAAGtT CaTAAACCAT AACGGTTCTA 540 
ATGGAAACCA ATCAGCTG 558 
(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7762 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 



GCTTCAGACA TnTGATGATA TAATCTCTCA TCATCGATTA ATTCTTTTGC AGCTTGATAC 60

ACRTnTTGCT TATTTGTTCC AATGACTTTT AATGTGCCAG CTTCAACACC TTCAGGACGT 120

TCTGTAACAC TTCGCCAAAA CTAAAACTGG CTTATTAAAT GATGGCGCTT CTTCCTGAAT 180

TCCACCTGAA TCTGTCAAAA TAAAATAAGA TTTTnTAGCA AAATTATGGA AATCTATACG 240

TCCAAAGGTT CAATCAATTC AATTCTGTCA TGACTACCTA AAATCTTTTG AGCCACCTCT 300

CGAACTTTCG GGTTTTTATG CATTGGATAT ACCAGTGCTA AATCAGTATA CTCATCTATT 360

AAGCGTCTAA CCGCTTTAAA TATATTTTCC aTGGGTTTCC CGATATTTTC TCGTCGGTGT 420

GCTGTCATrA GAATGAATTT JcTtGTCATGG TATTTATCCA TGATGTTAGA TTTATAATTG 480

TCATCAACTG TATATTTCAT AGCATCAATC GCAGTATTAC CAGTGACAAC AACACTTTCT 540

GAATATTTCC CTTCACTTAA CAAATGCGAT GCAGCATTTT TAGTAGGTGC AAAATGTAAG 600

TCAGCTAATA CACCAACTAA TTGTCTATTC ACCTCTTCTG GAAAAGGTGA ATATTTATCA 660

TAACTTCTAA GCCCTGCTTC AACGTGTCCA ATCGGCACTT GGTTATAAAA TGCCGCTAAA 720

CCACCTGCAA ATGTCGTCAT CGTATCACCA TGTACAAGTA CCATGTCTGG TTTTTCTAAT 780

TGAATCACTT GTTCTAATTG AGT6ATTGAT TTAGAAGTTA TCTCAGAAAG TGTCTGTCCT 840

GATTTCATAA TATTCAAATC GTATTTTGGT TTGATTTCAA AGGTACTTAA TACTGAATCA 900

AGCATTTCTC TATGCTGTGC TGTAACAACA ACAATTGGCT CGAGCATTTT TTCTTGTTCC 960
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ATCTTTTTCA TCAAACTACT TATCTCCGAT TCTTCTATTT AGTACCAAAC AATCTATCTC 1080 

CAGCXSTCGCC TAACCCTGGT GTGATATATG CTTTGTCATT aGCTTTTCAT CAAGTGCAGC 1140

AATATAAATA TCTACATCTG GATGTGCTTC ATGCATCTTT TCTACGCCTT CTGGTGCTGC 1200

AATTAAACAC ATGAAGCGAA TATTTTTAGC GCCACGTTTC TTCAATGAAG TAATAGCTTC 1260 

AATTGCTGAT GCGCCTGTTG CTAACATAGG ATCAACAACA ATGATTTGTC TTTCAGTAAT 1320 

ATCTTGAGGT AACTTAGCAA AATACTCTAC AGCCTTTAAT GTTTCGGGAT CTCGATATAA 1380 

ACCGATATGT CCAACTCTGG CTGCAGGTAC TAAACTTAAA ATACCATCAG TCATACCTAA 1440 

ACCAGCTCTT AAAATTGGAA CGATAGCTAA TTTTTTACCA 6CTAATCGTT TAGCCGTCAT 1500 

TTTAGTTACA GGCGTTTCAA TATCAACATC CTGAAGCTCT AAGTCTCTAG TTACTTCATA 1560 

TGCCATCAAC ATACCAACTT CGTCTACAAG TTCTCTAAAT TCTTTAGTAC CTGTATTTAC 1620 

ATCTCTAATA TAGCTTAGTT TGTGTTQAAT TAATGGATGA T0GAAAACX3T GTACTTTACT 1680

CATAAAAATT ACTCCTATCT TTGTGTATGT TTATTGATAT AGAGGATATT CAGCTGTTAA 1740 

TTTCGCAACG CGTTCTTTAG CTTGTTGTAA TTTTTCTTCA TCTTTACTAT TTTTCAATGC 1800 

TAAACTGATG ATTTTTGCAA CTTCCTCAAA AGCTTTTTCA TCAAATCCAC GOGTTGTTGC 1860

AGCAGGTGTA CCTAAACGTA TACCACTCGT TACAAAAGGT TTTTCTTGAT CGAACGGAAT 1920 

GGTATTTTTG TTACATGTGA TACCAACTGA ATCTAAAGTC TCTTCAGCTT CTTTACCAGT 1980 

AAGTCCTATA GACCCTTTTA CATCAACAGC TACTAAGTGA TTATCTGTAC CGCCAGAAAC 2040

AATTCTAAAT CCTTCATTAA TTAATGCTTC TGCAAGAACT TTTGCGTTTT TAACCACTTG 2100 

TTGTTGATAC GTTTTGAAAT TATTTTCTAA CGCTTCTCCA AAAGCAACTG CTTTtGCTgC 2160 

AATAACATGC TCAAGAGGTC CACCTTGAAT ACCAGGGAAA ATTGTTTTAT CTATGTCTTT 2220 

TTTATATTCT TCCTTACATA AAATCATACC ACCACGtGGT CCGcGTAATG TTTTGTGTGT 2280 

TGTAGTTGTT ACAAAATCAG CATATTCTAC TGGATTTGGA TGTAAACCTG CCX3CTACTAA 2340 

TCCTGCAATA TGTGCCATGT CTACCATTAA CTTAGCGTTT ACTTCATCTG CGATTTCTTT 2400 

AAACTTTTTG AAGTCAATTG TTCTTGAATA TGCTGATGCT CCTGCCACAA TAAGCTTAGG 2460 

CTTATGCTCT AACGCTAATT TACGAACTTC ATCATAATTG ATTCGTTCTG TGTCTTTATC 2520

TACTCCATAT TCAACGAAAT TGTAGAATTT ACCACTAAAA TTAACAGGCG CTCCATGTGT 2580 

CAAGTGACCA CCATGACTCA AATTCATACC TAAAACTGTG TCGCCCATTT CTAATGCAAC 2640 

TAAGTAAACA GCCATGTTCG CTTGTGAACC TGAATGTGGT TGAACATTGA CATGTTCAGC 2700

TCCAAACAAT GCTTTAGCAC GATCAATTGC GATGCTTTCA GTAACATCTA CAAACTCACA 2760 
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TTGTGCTTCC ATAACCGCTT CCGATACAAA ATTTTCCGAT GCGATTAACT CTATGTTGCT 2880

ATTTTGTCTC TGAAATTCTC TCTCGATTGC TTCTGCGATA ACTTTATCTT GCTTGGTGAT 2940

ATAAGACATA AAATCTCCCC TTCTTTCAAA AAAACTTATT GGTATTTAGC ACGTTCQCCA 3000

CCAATCTTTT TCGGCCTAGA TGTGGCAATA GTTACAATTG CCTGTCCTAC TTGCTTTACT 3060

GAGGTCCTTA CAGGTACACA TACATGTTTA ATATGCATGC CTATTAACGT TTGACCAATA 3120

TCAATTCCAC AAGGAACAGT AATATGTTCG ACCACGATCG GATCCTTCAT ATGCTGAAAA 3180

GCGTATGTTG CCAAACTCCC TCCAGCATGT ACATCTGGAA CGACGGAAAC TTCTTCCATT 3240

GTTAATGGAT TATACTGAGA TTTTTCTATT GTTATCGCTC TGTTGATATG TTCACATCCT 3300

TGAAAAGCAA AAGTAACGCC TGTCTCTTTA CTCACAACAT CTAATGCATT AAAAATAGTT 3360

TCTGCAACTT CCaTCGAACC GACAGTCCCT ATTTTTTCGC CAATGACTTC CX5ATGTTGAA 3420

CATCCAATTA AACATATATC TCCTTTATTA AAAAAGGACA TATCTTTTAA TTCGTCTAAT 3480

AACATTGTCA AATCTTTCAT AAAAGCCCAC CCTTCXTTAAA AATAAAAAAG GAATATAGCA 3540

AAGTGCTACA CTCCTCTATT ATAACTTATT TAACTGTTAA CATATACTAA TTATACAGAA 3600

TTCCTACTAG CAAATAATAT CTTTTAATTT TAAAATTAAA CTTACAAGTT CTTCATAGGT 3660

ATGTACATAC ATTTCTTTTG TTCCACCGTA TGGATCTATA ACTTCTCCTG CTTCTTTtAC 3720

ATATTCATGC AATGTGAAAA CATGATTTTG CAAACCAAAG TGTGCCTCTA TTAATTCTTT 3780

GTGCGAATAC GACATCGTCA AAATAATATC TGCTTTCAAA TCTGCTTCAG TAAATTGTTG 3840

CGATAAGGTC GTTTCAGCTA AATGATGTTC TTCAACTAAG TCTTCAACAT AATTCGAAAC 3900

ACCTTGATTG TTCACAGCGA ATATACCTCT TGATTCAAAT TGATGATTTG GCATAACCTC 3960

TTTTGCAATA CTTTCCGCTA ATGGGCTACG ACATGTGTTA CCTGTACAAA CGAATAAAAT 4020

CTT»TAGTT CACATCCTTT AATAATGTGA TTACCTGCAG CTTTTAACAT GCGATTCATA 4080

ATTGCTTCTG TATTATCATT CAGCTCAAAG CCGTATATAT ACX5CCGCTGA AATATTTTCA 4140

TTTTCATCAA GTGAATGTAA CACATCATAA AGATTATGAC TTGCTTGTTT AACATCATTG 4200

TCATCCTGAC ATAATTGAAT GAATTGCGCT TCACTTGGTA TAAACGCCAC CTTATTACTC 4260

GGCACAATAA AAGCTATAGA AGACCAATCT TTACCGTCAT TTCCAATTTT GCTCTCAATA 4320

TCTGTAATAA TTGTAAGTGG TGTATTGGGT GAGTAATGCT TATACTTCAT ACCTGGTGCA 4380

ATTGGCTGTT CAGTATCATT ATAATCAGCA TGGGCGATAC TATTCGGAAG TATTTCTGTA 4440

ATCATTGCTG CTGTTATAGA ACCAGGTCTT GCAATTTTAT AAGGAAAAGA TGTGCAATCT 4500

AAAACCGTAC TTTCTAATCC TTCTTCACTT TGTTCAGCTT GAACAATACC ATCGATACGG 4560
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GCACTTGGAG CAGCTAGAGG TTCATTTATG ATTTGTAATA ATTGTCTACC TACAGAATGG 4680 

CTTGGCATTC TAACAGCAAC TGATGATAAA CCTCCAGAAA CTTTTCGACA TAGATAGCCT 4740 

AGCTTTAACG GCAATATAAA CGAAATAGGG CCCGGCCAGA ATGCCTGCAT TAACTTTTCT 4800

AOGCGTGGAT CCAAAGTATA TGTAAAATCT TTTAATTGAC CTTTACTGTG TATATGAACA 4860 

ATAAGCGGAT TGTCAGATGG ACGGCCTTTA GCTTCATATA TTTTAGCTAC AGCTTCTTCA 4920 

TCTGTCX5CAT TTGCTGCAAG TCCATAAACT GTTTCAGTTG GTAAACCTAT TAAACCACCG 4980

TTTAAAACAA TGTCTTTTAT TTCATTAATT TTAGGATATT GCTGTAAATC TTCATTATAT 5040 

TCTCTAACAT CCCAAATTTT AGTATCCAAC TTAATCACGC CTTTCTTATT TATCATAATA 5100 


TAAAGCAAAA AGCTATGCAC TTAACTAATC ATAGCAAAGG CATAACTTCT AATTACCATT 5160 

TAAATGAGAC GATTCGATCX3 TGGCCATTTA TATCTTTAAT AATGTCGATT rrTTTGTCAG 5220 

GAAATTTATT TAAAATTATT GATTTAAGTQ CCTCACCTTG ATTGTAACCA ATTTCAAAAA 5280 


CAACTGGGCT GCCTTTTTCC ATAACGTGAG GTAAATCTTC AATGATTGAT TCATAAATAG 5340 

CATATCCATG GTTATCTGCA AACAATGCCT GATGTGGTTC GAATCTCGTA ACCGTTGGAG 5400 

ACATCGTAAC CATATCTTTT TCATCTATAT ATGGTGGATT AGATATCAAG CCX5TTCAACT 5460

TGATACCTTC ATTAATTAAG GGCTTTAATG CATCCCCTGT TAAAAATTGT ATTTGTGATT 5520 

GATGCTTCTC AGCATTATTA CGAGCCATAT TCATTGCTTC AAGTGAAATA TCAGTAGCAA 5580 

TAACATTTAA ATCCGGCTTT TCACATTTCA AAGTAATTGC AAGTACACCA CTACCCGTTC 5640

CGATATCTAC GATTGTTGCA TCATCTTCTA ACTGTTGTAA GAAATGCAAC ATTACTTCTT 5700 

CAGTTTCAGG TCTTGGTATC AAACAATTTG AGTTTACATC AAACGTTCTA CCATAAAATG 5760 

AGGCAAAGCC AACTATATAC TGTATAGGCT CTCCTAATAA CATACGTTGT AATGCTAAGT 5820

CGAACTTCAT AATCATCGCT TTCGGCATAT CATCATGCAT GTGGACTACA AAGTCCGTAC 5880 

GCGTCCATTG AAATACATCT AACATTAACC ATTCAGCTCG TGTTTGTTCA AACCCTTTTT 5940 


GTTGTGTTAA ATGAATTGCT TCATCTAACT TTTCTTTATA ATTCACCATT ATTAAGTTCT 6000 

TTCAATTTAT CTGTCTGCTC TGATAAAGTC AGTGCATCTA TAATTTCTTC TAAATGGCCT 6060 

TCCATAATTT GCCCTAATTT TTGAAGCGTT AGACCTATAC GATGGTCTGT TACACGGCTT 6120 


TGTGGATAAT TATAAGTTCG AATACGTTCT GAACGATCAC CAGTACCGAC TGCTGATTTA 6180 

CGTTGTGACO CATACTTTTG TTGTTCTTCT TGAACTTTCA TATCGTATAA ACGTGCTTTT 6240 

AACACTTTCA TTGCTTTTTC ACGGTTTTGA ATTTGAGACT TCTcAGAAGA TGTTGCAATG 6300

ACACCAGTTG 6TAAATGGGT AATACGTACT GCAGAGTCAG TTGTGTTTAC GTGCTGACCA 6360 




EP 0 786 519 A2 

ACATCTTCAA CTTCTGGTAA AACTGCCACT GTAGCTGTTG AAGTATGAAT ACGTCCACCT 6480 

GATTCTGTTT CAGGCACACG TTGAACGCGG TGCGCACCAT TTTCAAATTT CAATTTACTA 6540 

TACGCX5CCAT TACCAGAAAC TGAGAAACTA ATTTCTTTGT AACCACCATG GTCACTTTCA 6600 

GACGCTTCTA CTATTTCAGT TTTGAATCCT TGTGATTCAG CATACTTTGA ATACATACGC 6660 

ATTAAATCAC CAGCAAAAAT CGCAGCCTCA TCACCACCTG CTGCTGCTCT TATTTCTACA 6720 

ATAACGTCTT TGTCATCATT AGGATCTTTA GGAATCAATA ATATTTTAAG CTCTTCTTCA 6780

AGATTTGGAA GTTCAGCTTT AATACCATTA CTCTCCTCTT TTAACATTTC TACTTCTTCT 6840 

TTATC31TCAG TCTCACTTAA CATTTCTTCA ATATCAGCTA ATTCTTCTTT TTTAGCTTTA 6900 

TAGTTACGAT AAACATCTAC AGTTTTTTGT AAATCAGCTT GCTCITTAGA ATATTTACGT 6960 

AATTTATCTG AATCATTTAC AACATCTGGG TCACTTAACA GTTCATTTAA CTGTTC6TAT 7020 

CTTTCTTCTA CAATATCTAA TTGATCAAAC ACTTATAATT CCTCCTTATT ATTATCACTA 7080 

GGTGCTACGA TATGGTGCGC GCGACAACX?r GGCTCATAAC TTTCATTGGC ACCTACTAAG 7140 

ATAATCGGAT CATCGATTTT AGCTGGTTTA CCATTTATTA ATCGTTGCGT TCTACTAGAT 7200 

GAAGAACCAC AAACAGCACA AACTGCTTGA AGTTTCGTTA CTTGTTCACT GACAGCCATC 7260

AATTTAGGCA TTGGTTCGAA CGGTTCGCCC CTAAAATCCA TATCTAATCC AGCAACAATA 7320 

ACACGGTGTC CATCTGCTGA TAGTTTTTCT ACTATACTTA CAATTTCATC GTCAAAAAAT 7380 

TGCAcTTCGT CTATTCCTAT AACATCAACA TTAGTTAAGT CGTGCGTCAT AATTTCACTT 7440 

GCTTTAGAAA TATTAATCGC TTCAATGGCA TTACCATTAT GAGAQACCAC TTTTTCTTTA 7500 

TGATATCGAT CATCAATCGC CGGTTTAAAT ACAACGACTT TTTGTTTAGC GTATATACCC 7560 

CTTCTTAGAC GTCTTATTAG TTCTTCGGAT TTACCGCTAA ACATACTACC TGTAATACAT 7620 

TCTATCCAAC CGGAATGGTA AGTTTCATAC ATTGAGAGTn CCACCTTTTT CAAAACATAA 7680 

TCGCTTTATT ATATCATATT TCAAATATTC ATAAATGTCT TTnTCATAAT TATATCGATA 7740 

TTGTACATGA ACAATTATTT TA 7762 
(2) INFORMATION FOR SEQ ID NO: 125: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2583 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 
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TAAAAAAATT ATTATCAATG ATGAACTAGA ATTGACTGAA TTCCACCAAG AACTTACTTA 120 

TATTTTAGAC AACATAnAAG GGAATAATAA TTATGGTAAG GAATTTGTTG CAACCGTTGA 180 

AGAAACATTC GACATTGAAT AaAGCGGGGT GgaAGCACTA TGAATCAATG GGATCAGTTC 240

TTAACACCTT ATAAGCAAGC GGTTGATGAG TTGAAAGkGA AcTTaAAGGC ATGCGCAAAC 300

AATATGAAGT TGGTGAACAA GCGTCGCCAA TAGAATTTGT TACTGGTCGT GTTAAACCAA 360 


TCGCTAGTAT TATAGATAAG GCAAACAAAC GACAAATACC ATTTGATAGG TTAAGAGAAG 420

AAATGTACGA TATCGCTGGT TTAAGAATGA TGTGCCAATT TGTTGAAGAT ATTGATGTTG 480 

TCGTCAATAT TTTAAGACAA AGAmAAGATT TTAAAGTAAT TGAAGAACGA GATTATATTC 540 


GTAACACTAA AGAAAGTGGT TACCGCTCGT ATCATGTCAT TATTGAATAT CCAATTGAAA 600 

CATTACAAGG CCAAAAATTT ATATTGGCTG AGATTCAGAT TCGTACATTA GCAATGAATT 660 

TCTGGGCAAC GATTGAACAT ACTTTACGAT ATAAATATGA TGGTGCTTAT CCGGATGAAA 720 

TTCAACATCG TTTGGAAAGA GCGGCAQAAG CAGCGTATTT ACTTGATGAA GAGATGTCTG 780 

AAATTAAAGA TGAAATTCAG GAAGCTCAAA AATATTACAC GCAAAAACGT TCTAAAAAAC 840 

ATGAAAATGA TTAACGAGGT GTTATAAATC ATGCGTTATA CAATTTTAAC TAAAGGTGAC 900

TCCAAGTCTA ATGCCTTAAA GCATAAAATG ATGAACTATA TGAAAGrTTT TcGCATGaTT 960 

GaGGATrGTG AAAaTCCTGA AATTGTTATT yCAGTTGGTG GTGACGGTAC ATTACTACAA 1020 

GCATTCCATC AGTATAGCCA CATGTTATCA AAAGTGGCAT TTGTTGGAGT TCATACAGGT 1080

CATTTAGGAT TTTATGCGGA TTGGTTACCT CATGAAGTTG AAAAATTAAT CATCGAAATT 1140

AATAATTCAG AGTTTCAGGT CATTGAATAT CCATTGCTTG AAATTATTAT GAGATACAAC 1200 


GACAACGGCT ATGAAACAAG GTATTTAGCA TTAAATGAAG CAACGATGAA AACTGAAAAT 1260 

GGCTCAACAC TTGTTGTGGA TGTTAACTTA AGAGGGAAAC ACTTTGAGCG ATTTAGAGGC 1320 

6ATGGATTAT GTGTATCAAC ACCTTCXK^GT TCAACGGCTT ATAACAAAGC GCTAGGTGGC 1380 


GCACTGATAC ATCCTTCACT TGAAGCAATG CAAATTACAG AAATTGCCTC GATAAATAAT 1440 

CGTGTGTTTA GAACGGTAGG ATCACCACTT GTATTACCAA AGCATCATAC ATGTTTAATA 1500 

TCACCAGTTA ATCAT6ATAC CATTAGAATG ACGATAGATC ATGTTAGTAT CAAACATAAA 1560 


AATGTTAATT CAATACAATA CCGTGTAGCA AATGAAAAAG TGAGGTTTGC ACGTTTTAGA 1620 

CCATTCCCAT TCTGGAAACG TGTGCACX3AT TCTTTCATAT CAAGTGATGA AGAACGATGA 1680 

AATTTAAGTA TCATATATCA CAACAAGAAA CTGTTAAAAC TTTTTTAGCA CGACATGATT 1740

TTTCTAAGAA GACAGTGAGC GCCATTAAAA ATAATGGCGC TTTAATTGTT AATGATGAAC 1800 
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






AAATACCGAG TGTTAATTTA ATACCTTATG CTCGTAAGCT AGAAGTATTG TATGAAGATG 1920 

CTTTTATCAT CATAGTTACT AAACCAAACA ATCAAAATTG TACGCCTTCG AGAGAACATC 1980 

CTCATGAAAG TTTAATCGAA CAAGTACTAT ATCATTGTCA GGAACATGGT GAAAATATTA 2040 

ACCCACATAT TGTTACGCX3T CTAGATCGTA ATACAACTGG TATTGTGATA TTCGCTAAAT 2100 

ATGGACATAT CCATCATTTA TTTTCTAAAG TAAACTTGAA AAAAATATAT ACTTGCCTTG 2160 

TATATGGTAA AACCCATACA TCTGGTATTA TTGAAGCTAA TATTAGACGG TCAAAGGATA 2220 

GGATTATAAC TAGAGAAGTT GCCTCGGATG GTAAATACGC TAAAACATCT TATGAAGTAA 2280 

TAAATCAGAA TGATAAATAC AGTTTATGCA AAGTTCATTT GGATACaOQA CGTACACATC 2340 

AAATTC6TGT ACATTTTCAA CATATTGGGC ATCCAATTGT GGGAGATTCT TTGTATGATG 2400 

GTTTTCATGA CAAAATTCAT GGTCAAGTAC TGCAATGTAC GCAAATATAT TTTGTTCATC 2460 

CAATCAATAA GAACAATATT TATATTACAA TTGATTATAA GCAATTACTT AAATTATnCA 2520

ATCAACTCTA ATnCACACAG GGGGTGTAAG TATGTCAATG AnCACAGATG AAAAAGAGCG 2580 
TGT 2583

(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1818 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 


ATCAAGTGAT ACATTTAACT GGTAAAGGAT TAAnAGATGC TCAAGTTAAA AAATCnGGAT 60 

ATATACAATA TGAATTTGTT AAAGAGGATT TnACAGATTT ATTnGCAATT ACGGATACAG 120 

TAATAAGTAG AGCTGGATCA AATGCGATTT ATGAGTTCTT AACATTACGT ATACCAATGT 180 


TATTAGTACC ATTAGGTTTA GATCAATCCC QAGGCGACCA AATTGACAAT GCAAATCATT 240 

TTGCTGATAA AGGATATGCT AAAGCGATTG ATGAAGAACA ATTAACAGCA CAAATTTTAT 300 

TACAAGAACT AAATGAAATG GAACAGGAAA GAACTCGAAT TATCAATAAT ATGAAATC6T 360

ATGAACAAAG TTATACGAAA GAAGCTTTAT TTGATAAGAT GATTAAAGAC GCATTGAATT 420

AATGGGGGGT AATGCTTTAT GAGTCAATGG AAACGTATCT CTTTGCTCAT CGTTTTTACA 480 

TTGGTTTTTG GAATTATCGC GTTTTTCCAC GAATCAAGAC TTGGGAAATG GATTGATAAT 540

GAAGTTTATG AGTTTGTATA TTCATCAGAG AGCTTTATTA CGACATCTAT CATGCTTGGG 600 
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

CTCATGTTAA AGCGCCACAA AATTGAAGCA TTATTTTTTG CATTAACAAT GGCATTATCT 720

GGAATTTTGA ATCCAGCATT AAAAAATATA TTCGATAGAG AAAGACCTAC ATTGCTGCX3T 780

TTAATTGATA TAACAGGATT TAGTTTTCCT AGCGGTCATG CTATGGGATC AACTGCATAT 840

TTTGGAAGTG GTATCTATCT ATTAAATCGA TTAAATCT^G GTAATTCAAA AGGTATTCTT 900

ATAGGGTTAT GTGCAGCTAT GATTTTATTG ATTTCCATAT CACGTGTATA TCTAGGTGTA 960

CATTATCCAA CAGATATTAT TGCCGGCATT ATTGGTGGAT TATTTTGcAT TATTTTATCA 1020

ACGTTATTAC TTAGAAATAA ATTAATAAAT TAAATAGTAA AAAAACAAAA GCAGTAAACC 1080

TAAAGTGTCG TAAGGGTTTA CTGCTTTTAT AAAACGTTGT TATAACGTAT ATTGTCTTTT 1140

ACGGGCATAT AAnAGGGGAA TATTTGAnAA TCACCAATCC AACAAGAACG AAACGTTGTG 1200

GGGGGGATGT TCTATGTGGT ATTGATAATC ATTTTCAACT ACTATTATAC ATTAGTGAGA 1260

ATCATTGTCA ATTAGAAACT AAAACTTTTT TTGAATATTT TTTAAGAATA GTAAATAAAA 1320

CGCATGATTA CGCTATTTTA GAAAATAAAA AAATTTGTAT TTCTCATTAG AATTAGAATA 1380

TTTAAAAGTG ATGAGGTTTA AACATTATAT TGTTTACATA CTCCTTTTGA ATTCATACAT 1440

TATGAAATGT tACTTCCAAG TTCAAAATCG CACATT6AAA TGATGTGTGA AATGTTTAAA 1500

CTACGGTCAT tTTGTGmAAA TAAAGrTAAT AACTATTCAT TTTACAATAG TGAAAAGTCA 1560

GTATATGACA ACAATTAATA TTGCGGTAAG GCCTTGTGTT ACAGTATTCT ATATTTAAGT 1620

ACTGCAATCA GAATTAACAG AATGCCATTA ACTGATTATT AAATATTTGA GTTAATAAAT 1680

AATTAATGAT TGTAGCTTGA AAAATTTAAA ACATGGTTAT TGATTTGTGA TAAAATTTAA 1740

ACGTAAACAA ACTAATTTAA AAAGCAACTA TTGTATAGAA AAATACAAAA TTTAAAATAT 1800

ATTACCTTAT TAGAAAAA 1818



(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12658 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:
TGTTTAAACA ATAGGGGGAA TCTTATGATT GAAAAATTAG TAACCTTTTT AAATGAGGTT 60 
GTTTGGAGTA AGCCATTAGT TTATGGTTTG CTAATTACTG GTGTGCTATT TACATTGCGT 120

ATgCGATTTT TTCAAGTTAG ACATTTTAAA GAAATGATTC GATTAATGTT TCAAGGAGAG 180 
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GGTACAGGTA ATATTGTCGG TGTATCTACT GCAATATTTA TAGGAGGACC TGGTGCAGTA 300

TTTTGGATGT GGATTACTGC GTTTTTAGGT GCAAGTAGTG CTTTTATTGA ATCTACACTT 360

GGTCAAATAT TCAAGAGAGT TGAAAATAAT GAATACCGTG GTGGACCAGC GTATTATATT 420

GAATATGGTA ITGGTGGTAA ATTTGGTAAA ATTTACGGAA TTATCTTTGC TATTGTTACG 480

ATTATCTCAG TAGGTCTATT GCTTCCTGGT GTGCAATCTA ACGCTATAGC AAGTTCTATG 540

CATAATGCGA TTCATGTTCC ACAATGGTTA ATGGGTGGTA TTGTTGTAGT TATTTTGGGA 600

TTAATTATTT TTGGTGGTGT ACGTATTATT 6CCAATGTTG CAACAGCCGT TGTACCATTT 660

ATGGCAATTA TTTACATACT GATGGCTGTC ATTATCATTT GTATCAATAT ACAAGAAGTG 720

CCAGCGTTAT TTGCATTAAT TTTCAAATCA GCATTTGGAT TACAATCTGC TTTTGGTGGT 780

ATOGTTGGCG CAATGATAGA GATTGGTGTT AAAC6TGGAT TATATTCAAA TGAGGCTGGT 840

CAAGGTACAG GTCCACACGC AGCAGCGGCa gcAGaAGTAT CACATCCAAG TAAACAAGGT 900

CTAGTACAAG CATTTTCAGT TTATATTGAT ACATTATTTG TATGTACTGC AACTGCTCTG 960

ATTATACTTA TTTCTGGTAC ATATAATGTG ACTQATGGTA CX^GTTAATGC GAATGGCACA 1020

CCX;CATTTAA TTAAAGATGG CGGTATTTAT GTTgAAAATG CAACAGGTAA AGATTATTCA 1080

GGTACTGCGA TGTATGCACA AGCCGGCATt GATAAAGCGT TCCATGGCAG TGGTTATCAA 1140

TTTGATCCTA CTTTCTCTGG CGTAGgTTCG TACTTTATTG CATTTGCTTT ATTCTTCTTT 1200

GCATTTACTA CAATTTTGTC GTACTACTAC ATTACAGAAA CAAATGTTGC TTATTTAACG 1260

CGTAATCAAA ATAATCAAGT TTCATCGATA TTTATTAATA TTGCTCGTGT GATTATTTTG 1320

TTCGCTACAT TTTACGGTGC AGTTAAAACA GCTGATGTAG CATGGGCATT CGGTGATTTA 1380

GGTGTAGGTC TAATGGCTTG GTTAAATATC ATTGCGATTT GGATTTTACA TAAGCCTGCC 1440

GTAAATGCTT TAAAAGATTA TGAAATTCAA AAGAAACGTT TAGGCAACGG TTATAATGCA 1500

GTTTATCAAC CTGATCCGAA TAAATTACCT AATGCTGTCT TTTGGTTGAA GACGTATCCA 1560

GAACGTTTAA AACAAGCACG TGCCAAAAAG TAATCTACTT TTGTTTATAG TATATGTAGT 1620

GATCATTTGA TAAAAAAGAA AAGTATTGAG AATTTTAGGt GCTCAGAAAT TTGAATTTTA 1680

AAAATATAGT GTCTCTTGGT ACAATAACAA TACAACTACT AGGGGCACTT TTTTATGTCA 1740

GAATTTAAAA CTGGTAAGAT TAATAAACAT GTTTTATATA GTAATATTTT AAATAGAGAT 1800

GTCACGTTAA GTATTTATTT ACCAGAATCT TATAATCAAC TTGTTAAATA TAATGTCATT 1860

CTTTGCTTTG ACGGATTAGA TTTTTTACGT TTCGGGAGAA TACAACGTAC ATATGAATCG 1920

TTAATCAAAG AAGCGCGTAT TGATGATGCG ATCAITGTTG GATTCCATTA TGAAGACGTT 1980
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GTCGGTAAAQ AAATATTGCC ATTTATTGAC TCGACGTTTT CTACACTGAA AGTAGGTAAT 2100

GCAAGGTTAT TAGTAGGGGA TAGTTTAGC6 GGTAGTATTG CCTTATTAAC GGCGTTGACC 2160

TATCCAACGA TTTTTAGTCG TGTAGCAATG TTAAGTCCAC ATTCAGATGA AAAAGTATTA 2220

GATAAGCTAA ATCAATGTGC AAATAAAGAA CAATTGACAA TTTGGCATGT CATTGGTCTA 2280

GATGAAAAAG ATTTTACTTT ACCAACAAAT GGTAAGCGTG CCGATTTCTT AACACCGAAT 2340

AGAGAATTAG CTGAACAAAT TAAGAAATAT AATATAACTT ATTATTACGA TGAATTTGAT 2400

GGTGGTCACC AATGGAAAGA TTGGAAACCA TTGCTGTCAG ATATATTATT GTATTTTTTA 2460

AGTAAAAACA CAGATGATCA ACTTTATGAA TAATTTACAT TAGTAGATTT AGTATGAATT 2520

GTCTTCATAT AGTCTGGTCT ATAATATAAT TTATAAAAGA TTTTACTGTT TAATTTAATT 2580

TAAATTTGAC GAAATTGCAA AAGATGTATA ATGAATTATT TTTAATGTAA CGGTTTTCAA 2640

AGAAATTTGA TATAATAGCA ATAGGTTAAA CAAAGGAGGA ATTCAGATGA TTTTAGGATT 2700

AGCATTAATT CCATCAAAGT CATTTCAAGA AGCGGTGGAT TCTTACCGTA AAAGATATGA 2760

TAAACAGTAT TCACGAATTA AACCACATGT GACAATTAAA GCGCCATTTG AAATTAAAGA 2820

TGGTGATTTA GATTCTGTCA TTGAACAGGT TAGAGCTCGT ATTAATGGTA TACCAGCAGT 2880

AGAAGTTCAT GCTACAAAAG CTTCTAGCTT CAAACCAACG AACAAT6TGA TTTACTTTAA 2940

AGTTGCGAAG ACXMACGACT TAGAAGAATT GTTTAATCGC TTTAATGGAG AAGATTTCTA 3000

TGGAGAAGCT GAACATGTTT TTGTGCCACA CTTTACAATA GCACAAGGAC TATCTAGCCA 3060

AGAATTCGAA GATATTTTTG GTCaAGTAGC ATTAGCTGGG GTAGACCaTA AAGAAATTAT 3120

CGATGAATTA ACTTTGTTAC GTTTTGACGA TGACGAAGAT AAATGGAAAG TTATTGAAAC 3180

GTTTAAATTA GCTTAAGTAA CATAATAGTA TTGTTAATCG TAGTATGTTT GAATTAATAA 3240

GAAAATGGTC ATTTTTATTG AATGTAATAA AAATGACCAT TTTCTTTATT TTAAAATACG 3300

TTTTAACCTT ACTTAGCTTT TTCTCTATTT ACTATAAAGT rGCTTCCATA AAATACAGCT 3360

AAGACTAAAA AGATTAATGC CGAGAAATAA AATGTATTGT TTAAATTGTT GGTAAATTGT 3420

GTAATTAATC CGCCAAATAA TGGCCCTATC ATTGAGCCX5A ATCCTTGGAT ACTATTAAAA 3480

ACACCCCAAG TTTCTTCTTG TTCATCTGAT TTGATAAATC GTGCCATAAA GGTATTCCAT 3540

GCTGGTAATA AGATGCCATA CATTAGACCG ATAGCTAAAG CGATAATCCA CAAGATGTGA 3600

ATATTAACAA TCATAGATAG AGTAAAAATT AATATCATGT ATAAAATAAA TCCGCTTAGA 3660

ATAACACCAT ACATAAAGTT TCTGCTGCGG TTATCTATTA GTTTCGATAA AAATAGCATC 3720

GAAACTGCAC AGCCGATACC ACCAATAATG ATTGCAACAG TATATTCAAT TGT6CTTACG 3780
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TGTAAAAGAA TACCAGGGAA CaACAATAAA TGGcGCTTTG TCACATCAAC AATTTGTCTC 3900 

AATTGAGCTT TAACTGGACG AGTATTATAA TTTGTTAACT TTACATCGAC AAAATAATAT 3960


AATATCCATG CAATTAAAAC GACTAAAGAC ATCATGAAGG CAAAGCGTGT TGGGTGCACT 4020 

TTGATAAGTA GATTCATAAA AACCATACCT ACCAATAGGC CTAACAACCA TGAAAAATAA 4080

ACATAGCCCA TTTGTTTGCC ACGTTTATCT TCTTCAACAC TGGATAACAT AATGACCCAA 4140


ATAGGACTAA CTGCAATACC GAGCATCATA GCACTAAATA TGATTACAAA AGGTGATGCT 4200 

GGAAACCAAA TAACTAAAAA TAAACTTGTA AAT6CTAAAA TAAATCCAGT CGTTAAAACG 4260 

ATTTTTGTGC CGAATTTTTT CAGTAAAAAT CCTATAACAA AGTTTGTAGA TGCATCAGCA 4320 


ATAAAATGTA TTGAAAATGC TAGAGACGTT ATTGCTACAG CAATGGATGT AACTGTTGGC 4380 

AAGAAATTAA TATAGCTTAG GATATACATG CCTCTC6CAA ATTCCATTAA AAATAAGATA 4440 

ATAAGCaTTA AAATGAAATT TTTATGATTA GCGTAATTAT TTAACGAAGA ATCTTGCATA 4500

TAAAGGAACC TTTCCATAAA TCTCTTGTGG TTGTGATGAA T6ACCGATTA AATCAAGTAA 4560 

GTCTCGACAT ATTGTCTGTG TAGCATACTT AATTTTATCT TGTTCCATTG TACTAATCAT 4620 

GTTAGTTAAT TGCTCATTAC CGTTAGTTAA ACTTGCTACA ATTTTTATPG CTTCTTCTGG 4680

AGTATCAGCG ATTTTACCAA AACCTTTTTC TTCAAAGTAA AGGGCATTTT CAAGCTCTTG 4740 

ACCAGGTGCA GGATTTAGGA AAATCATTGG AATACAACGG GCGAAACCTT CAGTTATTGT 4800 

GATACCACCA GGTTTCGTAA TCATAAGTTG ACTTGATGCC ATCCATTCAT TCATGTGTTT 4860

GGTATAACCT AGAATCAATA CATTCTCGTT AGATTTAAAC TTAGCTGTTA AAGAACGCTT 4920 

TAGCTCTTTG CTCTTACCAC AAATCATAAC TACTTGTGCA TTTGCaCTTT tCGCTAATAT 4980


ATCAGTAATC ATCGTGTCAA AACCTTTAGA TACACCAAAT GCACCAGCTG aCATTAAAAT 5040 

AGTSGCTTA TCTGGATCTA AGTTGTTGTC TATTAACCAC TGCTTTTGAT TAATAGGCGT 5100 

TTCAAATTTG TTATCAATAG GAATACCTGT CaCTTTAACT GTTGAAGGAT CAATACCTAC 5160 


GTCTATGAAG TCTTGTTTCG TTTCTTTTGT TGCCACATAA TATCTTGTTG AATACGGCGT 5220 

AATCCAGTTT TTATGTAAGC GATAGTCT6T CATCACTGTA GCAACTGGAA TATTAATGTT 5280 

AAATTGCTCA GTTA6TACCG ACATAACTGG TGTAGGAAAC GTTAATAATA TTAAATCTGG 5340 


CTTTTCTTTT ATCAATAAAT TAATTAACTT ATTAAGTCCA TAGTATTTGT AAAAACATTT 5400 

GTCTAGTTTA TCTGGGCGGC TGTAATAAAA CCCTTTGTAC ATATTTCTAA AATATTTAAA 5460 

GCTATTGATA TACCATTTTT TACAAATAGA AGTCAAAATT GGATGAGCTT CCATAAATAA 5520

ATCGTGCTCA ATGACGCTTA AATQGTCTAG ATTCATATCA TTAAGTTGAT TAACGATACT 5580 
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






TTGAGTAACC ATTAATAGCC ACCCTCCGTT AGTTTGAAAA TTTTATTTAA GTGTAACTTA 5700 

TTTTACGGCA TTATAAAAGA AATAAAGACG CAAAGTCGTT ACATTTATAG CAATTTTAAT 5760 

CTATAGATGA ATTGATACAA AATAAAACGT TATTTTATAA AGCAATTTAT TGTTCTATGT 5820 

TTTATTTGTA TATTTAAAAT TATCCAGTAT ACAATTATAG CATATTTTTG GAAACAATTA 5880 

TGATATTATA CCATGTTACA AGATGGTTTT AATAATTTAA GATGAGCCAT AATTGTAAAA 5940 

CTAATTCATA ATACCGTATG TTTTATTTTT AATAGTAGAA ATTAGAAAAT GCTGATTAGT 6000 

AGGATATAAC AGTGAAATTA TAAATTTATT AACATCAACA AAACX3TGTAT AATAAACATA 6060 

TTGTAGAAAA AGGAGCGGTT CAGTTTGGAT GCAAGTACGT TGTTTAAGAA AGTAAAAGTA 6120 

AAGCGTGTAT TGGGTTCTTT AGAACAACAA ATAGATGATA TCACTACTGA TTCACGTACA 6180 

GCGAGAGAAG GTAOCATTTT TGTCGCTTCA GTTGGATATA CTGTAGACAG TCATAAGTTC 6240 

TGTCAAAATG TAGCTGATCA AGGGTGTAAG TTGGTAGTGG TCAATAAAGA ACAATCATTA 6300 

CCAGCTAACG TAACACAAGT GGTTGTGCCG GACACATTAA GAGTAGCTAG TATTCTAGCA 6360 

CACACATTAT ATGATTATCC GAGTCATCAG TTAGTGACAT TTGGTGTAaC GGGTACAAAT 6420 

GGTAAAACTT CTATT6CGAC GATGATTCAT TTAATTCAAA GAAAGTTACA AAAAAATAGT 6480

GCATATTTAG GAACTAATGG TTTCCAAATT AATGAAACAA AGACAAAAGG TGCAAATACG 6540 

ACACCAGAAA CAGTTTCTTT AACTAAGAAA ATTAAAGAAG CAGTTGATGC AGGCGCTGAA 6600 

TCTATGACAT TAGAAGTATC AAGCCATGGC TTAGTATTAG GACGACTGCG AGGCGTTGAA 6660 

TTTGACGTTG CAATATTTTC AAATTTAACA CAAGACCATT TAGATTTTCA TGGCACAATG 6720 

GAAGCATACG GACACGCGAA GTCTTTATTG TTTAGTCAAT TAGGTGAAGA TTTGTCGAAA 6780 

GAAAAGTATG TCGTGTTAAA CAATGACGAT TCATTTTCTG AGTATTTAAG AACAGTGACG 6840

CCTTATGAAG TATTTAGTTA TGGAATTGAT GAGGAAGCCC AATTTATGGC TAAAAATATT 6900 

CAAGAATCTT TACAAGGTGT CAGCTTTGAT TTTGTAACGC CTTTTGGAAC TTACCCAGTA 6960 

AAATCGCCTT ATGTTGGTAA GTTTAATATT TCTAATATTA TGGCX3GCAAT GATTGCGGTG 7020 

TGGAGTAAAG GTACATCTTT AGAAAC6ATT ATTAAAGCTG TTGAAAATTT AGAACCTGTT 7080 

GAAGGGC6AT TAGAAGTTTT AGATCCTTCG TTACCTATTG ATTTAATTAT CGATTATGCA 7140 

CATACAGCTG ATGGTAT6AA CAAATTAATC GATGCAGTAC AGCCTTTTGT AAAGCAAAAG 7200 

TTGATATTTT TAGTTGGTAT GGCAGGCGAA CGTGATTTAA CTAAAACGCC TGAAATGGGG 7260 

CGAGTTGCCT GTCGTGCAGA TTATGTCATT TTCACACCGG ATAATCCGGC AAATGATGAC 7320 

CCXy^AAATGT TAACGGCAGA ATTAGCCAAA GGTGCAACAC ATCAAAACTA TATTGAATTT 7380 
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






GTTTTAGCAT CAAAAGGAAG AGAACCATAT CAAATCATGC CAGGGCATAT TAAGGTGCCA 7500 

CATCGAGATG ATTTAATTGG CCTTGAAGCA GCTTACAAAA AGTTCGGTGG TGGCCCTGTT 7560 

GATTAATAAA AGATTTATTG ATGAAGGTAA AACTATTGAT GTTTATTTAT TCGAAGCATT 7620 

AAATAACCAG ATAATCATTG CTATACCAGA TTGGTTTTGG TCATATCAGA TGGCAATGAC 7680 

ATTAGATGAA GAAACTTGTT TTGAAGCAAT ACTCATGCAA TTGTTTGTTT TTAAAGAAGA 7740 

GGAAGAGGCA GAATCGATTG CATCACAACT AACAGATTGG ATAGAAACAT ATAAAAAGGA 7800 

GAAAGACTAA TGAACTTAAA GCAAGAAGTT GAGTCTAGAA AGACTTTTGC GATTATTTCA 7860 

CATCCCGATG CAGGGAAAAC AACGTTAACT GAAAAACTAT TGTACTTCAG TGGTGCTATT 7920 

CGTGAAGCX3G GTACAGTTAA AGGGAAGAAG ACTGGTAAAT TTGCGACAAG TGACTCGATX5 7980 

AAAGTTGAAC AAGAGCGTGG TATTTCTGTA ACTAGTTCAG TAATGCAATT TGATTACGAT 8040 

GATTATAAAA TCAATATCTT AGATACACCA GGACATGAAG ACTTTTCAGA AGATACGTAT 8100 

AGAACATTAA TGGCAGTTGA CAGTGCTGTC ATGGTCATAG ACTGT6CAAA AGGTATTGAA 8160 

CCACAAACAT TGAAGTTATT TAAAGTTTGT AAAATGCGTG GTATTCCAAT CTTTACATTC 8220 

ATIAATAAAT TAGACCGAGT AGGTAAAGAA CCATTTGAAT TATTAGAT6A AATCXIAAGAG 8280

ACATTAAATA TTGAAACATA CCCTATGAAT TGGCCAATTG GTATGGGACA AAGTTTCTTT 8340 

G6CATCATTG ATAGAAAGTC TAAAACAATT GAACCATTTA GAGATGAAGA AAATATATTA 8400

CATTTGAATG ATGATTTTGA GTTGGAAGAA GATCATGCAA TTACAAATGA TAGTGATTTT 8460

GAACAAGCGA TTGAAGAATT AATGTTGGTT GAAGAAGCGG GTGAAGCCTT TQATAATGAC 8520 

GCGCTGTTGA GTGGAGACTT AACACCTGTA TTTTTCGGTT CAGCTTTAGC TAACTTTGGT 8580 

GTACAAAATT TCTTAAATGC ATATGTTGAT TTTGCGCCAA TGCCAAATGC GAGACAAACA 8640 

AAAGfiAGACG TTGAAGTAAG CCCGTTTGAT GATTCATTTT CAGGATTTAT CTTTAAAATT 8700 

CAAGCCAACA TGGACCCTAA ACACCGTGAT AGAATTGCCT TTATGCGTGT CGTTAGTGGT 8760 

GCATTTGAAC GTGGTATGGA TGTTACTTTG CAACGTACTA ATAAAAAGGA AAAGATCACA 8820 

CGTTCAACGT CATTTATGGC AGACGATAAA GAAACTGTGA ATCATGCTGT AGCAGGCGAT 8880 

ATCATTGGAC TATATGATAC TGGTAATTAT CAAATTGGAG ATACTTTAGT TGGTGGAAAA 8940 

CAAACCTACA GTTTCCAAGA TTTACCACAA TTTACGCCAG AAATTTTTAT GAAAGTTTCT 9000 

GCTAAAAACG TCATGAAACA GAAGCATTTC CATAAAGGTA TTGAACAATT AGTACAAGAA 9060 

GGTGCGATTC AATACTATAA AACATTACAC ACAAACCAAA TTATTTTAGG TGCTGTTGGT 9120 

CAGTTACAAT TTGAAGTTTT CGAACATAGA ATGAAAAACG AATATAATGT TGATGTTGTT 9180 
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AAGATGAACA CATCAAGATC GATTTTAGTG AAAGATAGAT ATGAOGATTT AGTATTCTTA 9300 

TTTGAAAATG AATTTGCAAC AAGATGGTTT GAAGAGAAAT TCCCTGAAAT TAAATTGTAT 9360 

AGTTTACTTT AACAGCTCAA TTGTATAATC GAATTTGTTA CATTAAAAAT AATTGTTTCG 9420

TTGAAGAAAA ATAAATTGTA TATTTTAAAA GAAAAAGGTA TACTATGATG TATCAAATGA 9480 

ATAACCTATG GCATTTTGTC AGAGGGGAGT AACTTAAGAA TCATGACCGT ATAAATGaTT 9540


CGACACTTTA TCGTCATTAC GArGATATCT TCCGGTAAAG TGGGCAATTT AAATTGCTTA 9600 

GTGAGACCTT TGCTATTTAT TTAGCATAGO TCTTTTTGTT TGTACTTAAC TTATTTATTT 9660 

AAAGGAOTTG TACATGTTAA TGGATCCAAG TTTGATCTTA CCTTATTTAT GGGTACTTGT 9720 

C6TTTTAGTA TTTTTAGAAG GCTTATTAGC AGCAGATAAC GCGATTGTTA TGGCTGTAAT 9780 

GGTTAAGCAC TTACCACCCG AACAACGTAA AAAAGCTTTG TTTTACGGTT TGTTAGGTGC 9840 

ATTTGTATTT AGATTTTTAG CATTATTCTT AATTAGTATT ATCGCGAACT TTTGGTTTAT 9900

TCAAGCTGCA GGAGCGGTTT ACTTAATTTA TATGTCAATC AAAAATCTGT GGCAGTTCTT 9960 

TAAACACCCA GAAATTGAAA GTCCTGAAGC TGGAGATGAT CATCATTATG ATGAATCTGG 10020 


TGAAGAGATT AAAGCAAGTA ACAAATCATT CTGGGGAACT GTGTTGAAAA TAGAATTTGC 10080 

AGATATCGCA TTTGCCATTG ATTCTATGCT TGCTGCTTTA gCTATTGCTG TAACACTTCC 10140 

TAAAGTTGGT ATTCACTTTG GTGGTATGGA CTTAGGTCAG TTCGTAGTCA TGTTCCTAGG 10200 


TGGAATGATT GGTGTTATTC TAATGCGTTA TOCAGCAACA TGGTTTGTAG AGCTATTAAA 10260 

CAAATATCCA GGACTTGAAG GTGCAGCCTt CGCGATCGTT GGTTGGGTAG GTGTTAAATT 10320 

AGTTGTCATG GTATTAGCGC ACCCAGACAT CGCTOTATTG CCTGAGCACT TCCCACATGG 10380 

CGTATTATGG CAATCTATTT TCTGGACAGT ACTAATTGGA TTAGTAATTA TCGGTTGGTT 10440 

AGGlfCAGTT GTTAAAAATA AAAAATCGCA TAAATAATTG ATGTGAAGCG GACAATCTTA 10500 

ATTTAGTTTA AGGTTGTCCT TTTTCATTTA ATTGAGTGAT TTATGAAAAA TGGATTTTGA 10560

AGAATGTGAA TCT^AAAGATG CGATATAGTA TTAAGAAAAT GTGCCTTTTA TATTTAGCAT 10620 

TTTTTCAATA GAAATTATAT AGATTTTAAA GCAAATTAGG TGTTAATGTG TCATAATGAT 10680 


AAGTGATnT ATTGAATGGA GTGGACATTA GTGGATATTG GTAAAAAACA TGTAATTCCT 10740 

AAAAGTCAGT nACCsaCGTA AGCGTCGTGA ATTCTTCCAC AACGAAGACA GAGAAGAAAA 10800 

TTTAAATCAA CATCAAGATA AACAAAATAT AGATAATACA ACATCAAAAA AAGCAGATAA 10860 


GCAAATACAT AAAGATTCAA TTGATAAGCA CGAAC6TTTT AAAAATAGTT TATCATCGCA 10920 

TTTAGAACAG AGAAACCGTG ATGTTAATGA GAATAAAGCT GAAGAAAGTA AAAGTAATCA 10980 
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(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:
TGAAATnGAA TAGTACTATT GCAAGTGTAA AGAGGTTAAT TTTTGCCnCA CGCGGGACTT 60 

AAAAAGGCAA CCACTGGTTG TGACATATCC TTATTTACAT TTATAAATAT AAGGAGGAGG 120 

TAGTAGTGAA AGACTTATTG CAAGCACAGC AAAAGCTTAT ACCGGATCTC ATAGATAAAA 180 

TGTATAAACG TTTTTCTATT CTTACTACTA TCTCAAAAAA TCAGCCTGTC GGACGTCGAA 240 

GTTTAAGCGA ACATATGGAT ATGACTGAAC GTGTACT6CG TTCTGAAACA GATATGCTTA 300

AGAAACAAGA TTTGATAAAA GTTAAGCCTA CCGGAATGGA AATTACAGCT GAAGGTGAGC 360 

AACTGATTTC GCAATTGAAA GGTTACTTTG ATATCTATGC AGATGATAAT CGTCTGTCAG 420 

AAGGTATTAA GAATAAATTT CAAATTAAGG AAGTTCATGT TGTTCCTGGT GATGCTGATA 480

ATAGTCAATC TGTTAAAACA GAATTAGGTA GACAAGCAGG TCAATTACTT GAAGGCATAT 540 

TACAAGAAGA CGCGATAGTT GCTGTAACTG GCGGATCCAC GATGGCATGT GTTAGTGAAG 600 

CAATXCATTT ATTACCATAT AATGTATTCT TCGTACCAGC CAGAGGTGGA CTAGGCGAAA 660 

ATGTTGTCTT TCAGGCAAAC ACAATTGCAG CCAGTATGGc aCAACAAGCT GGCGGTTATT 720 

ATACGACGAT GTATGTACCT GATAATGTCA GTGAAaCAAC ATATAATACA TTGTTGTTAG 780 

AGCCATCAGT CATAAACACT TTAGACAAAA TTAAACAAGC AAACGTTATA TTACACGGCA 840 

TTGGTGATGC GCTGAAGATG GCGCATCGAC GTCAATCACC TGAAAAGGTC ATTGAACAAC 900 

TTCAACATCA TCAAGCTGTC GGAGAGGCAT TTGGTTATTA TTTTGATACA CAAGGTCAAA 960

TTGTCCATAA GGTTAAAACA ATTGGACTTC AATTAGAAGA CCTTGAATCA AAAGACTTTA 1020 

TTTTTGCAGT TGCAGGAGGC AAATCGAAAG GTGAAGCAAT TAAAGCATAC TTGACGATTG 1080 

CACCCAAGAA TACAGTGTTA ATCACTGATG AAGCCGCAGC AAAGATAATA CTTGAATAAG 1140 

AGATAAAAAG TTTAATACTT TTTAAATATC ATTTTAAAGG AGGCCATTAT AATGGCAGTA 1200 

AAAGTMCAA TTAATGGTTT TGGTAGAATT GGTCGTTTAG CATTCAGAAG AATTCAAGAA 1260 

GTAGAAGGTC TTGAAGTTGT AGCAGTAAAC GACTTAACAG ATGACGACAT GTTAGCGCAT 1320 

TTATTAAAAT ATGACACTAT GCAAGGTCGT TTCACAGGTG AAGTAGAGGT AGTTGATGGT 1380 

GGTTTCCGCG TAAATGGTAA AGAAGTTAAA TCATTCAGTG AACCAGATGC AAGCAAATTA 1440

CCTTGGAAAG ACTTAAATAT CGATGTAGTA TTAGAATGTA CTGGTTTCTA CACTGATAAA 1500 

GATAAAGCAC AAGCTCATAT TGAAGCAGGC GCTAAAAAAG TATTAATCTC AGCACCAGCT 1560 
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ACAGTTGTTT CAGGTGCTTC ATGTACTACA AACTCATTAG CACCAGTTGC TAAAGTTTTA 1680 

AACGATGACT TTGGTTTAGT TGAAGGTTTA ATGACTACAA TTCACGCTTA CACAGGTGAT 1740 


CAAAATACAC AAGACGCACC TCACAGAAAA GGTGACAAAC GTCGTGCTCG TGCAGCGGCA 1800 

GAAAACATCA TCCCTAACTC AACAGGTGCT GCTAAAGCTA TCGGTAAAGT TATTCCTGAA 1860 

ATCGATGGTA AATTAGATGG TGGTGCACAA CGTGTTCCTG TAGCTACAGG TTCATTAACT 1920 


GAATTAACAG TAGTATTAGA AAAACAAGAC GTAACAGTTG AACAAGTTAA CXSAAGCTATG 1980 

AAAAATGCTT CAAACGAATC ATTCGGTtAC ACTGAAGACG AAATCGTTTC TTCAGACGTT 2040 

GTAGGTATGA CTTACGGTTC ATTATTCGAC GCTACACAAA CTCGTGTAAT GTCAGTTGGC 2100

GACCGTCAAT TAGTTAAAGT TGCAGCTTGG TATGATAACG AAATGTCATA TACTGCACAA 2160 

TTAGTTCGTA CATTAGCATA CTTAGCTGAA CTTTCTAAAT AATTTTAGTA TAGTTTTTAT 2220 

TCAAATACGC TAGTGCTCAG AACTATTTAG CATTAATTAA AGCTTATGAG TAAGCGGGGA 2280 

GCACAAACGC TTCTCCGCTT ATTTTTATAT AAAATTTCCT AATTACAAGG AGGAAACACC 2340 

ATGGCTAAAA AAATTGTTTC TGATTTAGAT CTTAAAGGTA AAACAGTCCT AGTACGTGCT 2400 


GATTTTAACG TACCTTTAAA AGACGGTGAA ATTACTAATG ACAACCGTAT CGTTCAAGCT 2460 

TTACCTACAA TTCAATACAT CATCGAACAA GGTGGTAAAA TCGTACTATT TTCACATTTA 2520 

GGTAAAGTGA AAGAAGAAAG TGATAAAGCA AAATTAACTT TACGTCCAGT TGCTGAAGAC 2580 


TTATCTAAGA AATTAGATAA AGAAGTTGTT TTCGTACCAG AAACACGCGG CGAAAAACTT 2640 

GAAGCTGCTA TTAAAGACCT TAAAGAAGGC GACGTATTAT TA6TTGAAAA TACACGTTAT 2700 

GAAGATTTAG ACX3GTAAAAA AGAATCTAAA AATGATCCAG AATTAGGTAA ATACTGGGCA 2760

TCTTTAGGTG ATGTGTTTGT AAATGATGCT TTTGGTACTG CGCATCGTGA GCATGCATCT 2820 

AATOTTGGTA TTTCTACACA TTTAGAAACT GCAGCTGGAT TCTTAATGGA TAAAGAAATT 2880 

AAGTTTATTG GCGGCGTAGT TAACGATCCA CATAAACCAG TTGTTGCTAT TTTAGGTGGA 2940

GCAAAAGTAT CTGACAAAAT TAATGTCATC AAAAACTTAG TTAACATAGC TGATAAAATT 3000 

ATCATCGGCG GAGGTATGGC TTATACTTTC TTAAAAGCGC AAGGTAAAGA AATTGGTATT 3060 


TCATTATTAG AAGAAGATAA AATCGACTTC GCAAAAGATT TATTAGAAAA ACATGGTGAT 3120 

AAAATTGTAT TACCAGTAGA CACTAAAGTT GCTAAAGAAT TTTCTAATGA TGCCAAAATC 3180 

ACTGTAGTAC CATCTGATTC AATTCCAGCA GACCAAGAAG GTATGGATAT TGGACCAAAC 3240 


ACTGTAAAAT TATTTGCAGA TGAATTAGAA GGTGCGCACA CTGTTGTATG GAATGGACCT 3300 

ATGGGTGTAT TCGAGTTCAG TAACTTTGCA CAAGGTACAA TTGGTGTATG TAAAGCAATT 3360 
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TCTTTAGGTT TTGAAAATGA CTTCACTCAT ATTTCAACTG GTGGCX3GCGC GTCATTAGAG 3480

TACCTAGAAG GTAAAGAATT GCCTGGTATC AAAGCAATCA ATAATAAATA ATAAAGTGAT 3540

AGTTTAAAGT GATGTGGCAT GTTTGTTTAA CATTGTTACG GGAAAACAGT CACAAGATGA 3600

CATCGTGTTT CATCACTTTT CAAAAATATT TACAAAACAA GGAGTGTCTT TAATGAGAAC 3660

ACCAATTATA GCTGGTAACT GGAAAATGAA CAAAACAGTA CAAGAAGCAA AAGatTCGTC 3720

AATACATTAC CAACACTACC AGATTCAAAA GAA6TAGAAT CAGTAATTTG TGCACCAGCA 3780

ATTCAATTAG ATGCATTAAC TACTGCAGTT AAAGAAGGAA AAGCACAAGG TTTAGAAATC 3840

GGTGCTCAAA ATACGTATTT CGAAGATAAT GGTGCGTTCA CAGGTGAAAC GTCTCCAGTT 3900

GCATTAGCAG ATTTAGGCGT TAAATACGTT GTTATCGGTC ATTCTGAACG TCGTGAATTA 3960

TTCCACGAAA CAGATGAAGA AATTAACAAA AAAGCGCACG CTATTTTCAA ACATGGAATG 4020

ACTCCAATTA TATGTGTTGG TGAAACAGAC GAAGAGCGTG AAAGTGGTAA AGCTAACGAT 4080

GTTGTAGGTG AGCAAGTTAA GAAAGCTGTT GCAGGTTTAT CTGAAGATCA ACTTAAATCA 4140

GTTGTAATTG CTTATGAACC AATCTGGGCA ATCGGAACTG GTAAATCATC AACATCTGAA 4200

GATGCAAATG AAATGTGTGC ATTTGTACGT CAAACTATTG CTGACTTATC AAGCAAAGAA 4260

GTATCAGAAG CAACTCGTAT TCAATATGGT GGTAGTGTTA AACCTAACAA CATTAAAGAA 4320

TACATGGCAC AAACTGATAT TGATGGGGCA TTAGTAGGTG GCGCATCACT TAAAGTTGAA 4380

GATTTCGTAC AATTGTTAGA AGGTGCAAAA TAATCATGGC TAAGAAACCa ACTGCGTTAA 4440

TTATTTTAGA TGGTTTTGCG AACCGCGAAA GCGAACATGG TAATGCGGTA AAATTAGCAA 4500

ACAAGCCTAA TTTTGATCGT TATTACAACA AATATCCAAC GACTCAAATC GAAGCGAGTG 4560

GCTTAGATGT TGGACTACCT GAAGgACAAA TGGGTAACTC AGAAGTTGGT CATATGAATA 4620

TCGGTGCAGG ACXSTATCGTT TATCAAAGTT TAACTCGAAT CAATAAATCA ATTGAAGACG 4680

GTGATTTCTT TGAAAATGAT GTTTTAAATA ATGCAATTGC ACACGTGAAT TCACATGATT 4740

CAGCGTTACA CATCTTTGGT TTATTGTCTG ACGGTGGTGT ACACAGTCAT TACAAACATT 4800

TATTTGCTTT GTTAGAACTT GCTAAAAAAC AAGGTCTTGA AAAAGTTTAC GTACACGCAT 4860

TTTTAGATGG CCGTGACGTA GATCAAAAAT CCGCTTTGAA ATACATCGAA GAGACTGAAG 4920

CTAAATTCAA TGAATTAGGC ATTGGTCAAT TTGCATCTGT GTCTGGTCGT TATTATGCAA 4980

TGGATCGTGA CAAACX3TTGG GAACGTGAAG AAAAAGCTTA CAATGCTATT CGTAATTTTG 5040

ATGCCCCAAC TTATGCAACT GCCAAAGAAG GTGTAGAAGC AAGCTATAAT GAGGGCTTAA 5100

CTGACGAATT CGTAGTACCA TTCATCGTTG AGAATCAAAA TGACGGTGTT AATGATGGAG 5160
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CGAACAGAGC ATTCGAAGGC TTTAAAGTTG AACAAGTTAA AGACTTATTC TATGCAACAT 5280

TCACTAAGTA TAATGACAAT ATCGATGCGG CTATCGTCTT CGAAAAAGTT GATTTAAATA 5340

ATACAATTGG TGAAATTGCA CAAAATAACA ATTTAACTCA ATTACGTATT GCAGAAACTG 5400

AAAAATACCC TCACGTTACT TACTTTATQA GTGGTGGACG TAACX3AGGAA TTTAAAGGTG 5460

AACGCCGTCG TTTAATTGAT TCACCTAAAG TTGCAACGTA TGACTTGAAA CCAGAAATGA 5520

GTGCTTATGA AGTTAAAGAT GCATTATTAG AAGAGTTAAA TAAAGGTGAC TTGGACTTAA 5580

TTATTTTAAA CTTTGCTAAC CCTGATATGG TTGGACATAG TGGTATGCTT GAGCCGACAA 5640

TCAAAGCAAT CGAAGC6GTT GATGAATGTT TAGGAGAAGT GGTTGATAAG ATTTTAGACA 5700

TGGACGGTTA TGCAATTATT ACTGCTGACC ATGGTAACTC TGATCAA6TA TTGACGGaTG 5760

ATGATCAACC AATGACTACG CAwACAACGA ACCCAGTACC AGTGATT6TA ACAAAAGAAG 5820

GCGTTACACT TAGAGAAACT GGTCGCTTAG GTGACTTAGC ACCTACATTA TTAGATTTAT 5880

TAAATGTAGA ACAACCTGAA GATATGACAG GTGAaTCTTT AATTAAACAC TAATATTGTA 5940

AAAGATGTTA AGTAAACGCT TAATGACACT TATTTTTTGA AAATAATAGT AATATCnTTT 6000

TGTTAAATGA AAGAATAAAG CTATAATAAT TATAGAATAA CTATTTAn 6048



(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 5602 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:
AAAQAGTGC AAGATATCAT CGCATTAATT AAGTCGTTAC AAAgTGTAAT TGTAGACaTC 60 
GCTTCCAATA ATGTTGATAC AATTATGCCT GGTTATACTC ATTTACAGCG TGCACAGCCA 120

ATTTCATTTG CACATCATAT TATGACTTAT TTTTGGATGT TACAACGAGA CCAACAACGA 180 
TTTGAAGATA GTTTAAAACG AATCGATATT AATCCTTTAG GTGCAGCAGC CTTAAGTGGT 240 

ACCACATACC CTATCGATAG ACACGAGACA ACAGCATTGT TGAACTTTGG CAGTCTCTAT 300 
GAGAATAGCC TAGATGCTGT TAGTGACAGA GACTATATTA TTGAAACATT 6CATAATATT 360 
TCTTTAACGA TGGTTCACTT ATCACGCTTT GCAGAGGAAA TTATTTTCTG GTCCACAGAC 420 

GAAGCTAAAT TCATTACATT ATCAGATGCA TTTTCAACTG GCTCATCTAT TATGCCACAA 480 
AAGAAAAATC CTGATATGGC AQAATTAATT AGAGGTAAAG TTGGTCGAAC GACTGGTCAT 540 

GAAGATAAAG AAGGTTTATT CGATGCTGTC CATACAATTA AAGGTTCTTT ACGTATTTTC 660 

GAAGGTATGA TTCAAACGAT GACAATTAAT AAAGAACGAC TCAATCAAAC TGTTAAAGAA 720 

GATTTTTCAA ATGCAACGGA ACTAGCAGAT TATTTAGTAA CTAAAAATAT TCCATTTAGA 780 

ACTGCACATG AAATTGTAGG AAAAATCGTC TTAGAATGTA TACAACAAGG TCATTATTTA 840 

TTAGATGTTC CTTTAGCAAC ATATCAACAA CATCATTCTA GTATTGATGC CGATATTTAC 900 

GATTATTTGC AGCCTGAAAA TTGTTTAAAA CGACX3TCAAA GTTACGGTTC AACAGGTCAA 960 

TCATOGGTCA AACAACAACT TGATGTTGCT AAACAATTAC TATCACAATA AATACGTTAA 1020 

TCTACCTACC CACAATGTCT ATTAAAATTA CATTGTGGGT ATTTTAATGC TCTCTTCGTC 1080

TTGTTGAACA TCACAnTTT AAGATTCCTA AAATGTTTGA TAATTCTTTT AAATTTATAT 1140 

TACAAAAATG TTATAAATTG TAAAAGAAAT GTGTAAAGCG TTTTCACAAG CAGGTTTTTG 1200 

TAGTATTTTA AAATTGTTAG ACTACAAATA AAGAGATGAA AGGATAAAGA CTATGACTAA 1260 

CTCTTCGAAA AGCTTCACTA AATTTATGGC TGCTTCTGCT GTTTTTACTA TGGGATTTTT 1320 

ATCAGTACCT ACTGCTGGCG CTGAACAAAC AAATCAAATT GCAAATAAAC CTCAGGCTAT 1380 

TCAATGGCAT ACAAATTTAA CGAATGAGCG ATTCACTACT ATCGCACATC GTGGCGCAAG 1440 

TGGCTATGCA CCCGAGCATA CGTTTCAAGC ATATGATAAG AGTCATAATG AGTTAAAAGC 1500 

ATCTTATATC GAAATTGATT TACAACGTAC CAAAGATGGC CATTTAGTTG CTATCCATGA 1560 

TGAAACTGTT AACCGTACAA CAAATGGACA CGGTAAAGTT GAGGATTATA CCCTTGATGA 1620 

ATTAAAACAG TTAGATGCAG GAAGTTGGTT TAATAAAAAA TATCCAAAAT ACX3CAAGAGC 1680 

AAGTTATAAA AATGCTAAAG TACCCACTTT AGATGAAATT TTAGAACGTT ATGGCCCGAA 1740 

TGCAAACTAT TATATTGAAA CAAAGTCACC TGATGTATAC CCAGGAATGG AAGAACAATT 1800 

ATTMCTTCA TTGAAAAAGC ATCACCTTTT AAATAACAAT AAATTAAAAA ATGGACATGT 1860 

AATGATTCAA TCATTTTCTG ACGAAAGTTT AAAGAAAATT CATCGTCAAA ATAAGCATGT 1920

GCCATTAGTA AAATTAGTTG ATAAAGGTGA ACTACAACAA TTTAACGACC AACGCTTAAA 1980 

AGAGATACGC TCTTATGCGA TTGGATTAGG TCCTGATTAT ACAGATTTAA CTGAACAAAA 2040

TACCCATCAT TTAAAAGACT TAGGATTTAT AGTACATCCT TATACAGTGA ATGAAAAAGC 2100 

TGATATGTTA CGATTAAATA AATATGGCGT TGATGGTGTC TTTACAAATT TCGCTGATAA 2160 

ATATAAAGAA GTCATTAAGT AGTAATGTTA AACTAGAAAA CATAAATACA AAAATATAGC 2220 

TATTACTATA AAAAACAGCA GTAAGATATT TCCAAATTGA AATTATCCTA CTGCTGTCTT 2280 

TTTGGGAGTG GGACAGAAAT GATATTTTCG CAAAATTTAT TTCGTCGTCC CACCCCAACT 2340 


TTGTCTGTAG AAATTGAGGA GCTAATTTCT CTGTGTCGGG GCTCCACCCC AACTTGCACA 2460
CTATTGTAAG CTGACTTTCC GCCAGCCTCT GTGTTGGGGC CCCX5CCAACT TGCACACTAT 2520
TGTAAGCTGA CTTTCCACCA GCCTCTGTGT TGGGGCCCCG ACTATTTTTG AAAAGAGCGT 2580
GTTACACGGG CATTGTTTTA CAGTCAACTA CTGCTAAAAT AAAATTAACG AGCTTAGGGC 2640
TTTGTTTTCT GTCCCAAGCT CGTTAAATCA CATATGATAA TTAATTATGC CCAACCACXy^ 2700
TATCTAGCTG CTTCTGCTGT ACGTTTAATA CCTATGATAT ATGCTGCAAG TCTCATATCT 2760
ATTTTTCGGT TTTGAGACAA TTCX5TAAATC GTATCAAATG COGCTTCTAA TTTTTCACGT 2820
AGCTTTTCAT TAACTTCTTC TTCAGACCAA TAATAACCTT GATTATTTTG TACCCATTCG 2880
AAGTAAGAAA CCGTtACACC ACCAGCACTT GCTAATACGT CTGGAACTAA TAATATACCA 2940
CXfTTCAGTTA AAATACGTGT TGCTTCTGGT GTTGTAGGTC CATTAGCAGC TTCAACAACG 3000
ATACTAGCTT TAATATCATG TGCATTGTCT TCTGTAATTT GGTTTGAAAT AGCCGCTGGT 3060
ACTAAAATGT CACAATCTAA TTCAAACAAT TCTTTATTTG AGATTGTTTC TTCAAATAAA 3120
TTTGTTACCG TACCAAAACT ATCACX3ACGG TCTAATAAAT AATCTATATC TAAGCCATTT 3180
GGATCGTGTA ATGCACCGTA AGCATCAGAG ATACCTACAA TTTTTGCACC TAAATCATAT 3240
AAGAATTTAG CTAAGAAACT TCCGGCATTA CCGAAACCTT GAATAACAAC CTTGGCACCT 3300
TCAATTTGCA TATTACGACG TTTTGCAGCT TGTTCAATTG CAATAACTAC ACCTAGTGCA 3360
GTTGATCTGT CGCGTCCATG AGAACCACCC AATACAATTG GTTTACCTGT GATGAAACCT 3420
GGTGAATTAA ATTTATCTAA TGCACTATAT TCATCCATCA TCCAAGCCAT AATTTGTGAG 3480
TTTGTAAATA CATCTGGTGC TGGAATATCT TTGTTCGGAC CTACGAATTG TGAAATTGCT 3540
CTTACATATC CGCGTGATAA ACGTTCAACT TCATGAATGC TCATTTGACG TGGATCACAA 3600
ACGMACCAC CCTTACCACC ACCGTATGGT AAGTTTACAA TGCCACATTT CAAAGTCATC 3660
CACATTGATA ATGCTTTTAC TTCTTCTTCA TCAACATCTG GGTGGAAACG CACGCCCCCT 3720
TTTGTTGGTC CAACAGCATC ATTATGTTGC GCACGGTAAC CTGTGAATGT TTTTACTGTG 3780
CCATCATCCA TTCGTACAGG GATACGCACT TGTAACATTC TTAAAGGTTC TTTAATTAAA 3840
TCGTACATTC CTtCGTCAAA TCCCAATTTA TGCAATGCTT «7 7 U W
GAAGTTACTA AATTATTGTT CTCAGTCATG ATCCTTTTCG CCTCTTCTTT ACCTAATGAT 3960
TTCGCTTTCA AACATATTGT AACATAACGT ATTCCTTTTT AAAGCCCTTA CAAACTGATT 4020
GTTACAACTT TTTGACATTA TTGAAATACA TGTCTTATTT TTTCAAGTGC AAGGTCCAAT 4080
TCTTCTTTAG TAATAATTAA TGGTGGTGCA AAACGAATGA CAGTATCATG CGTTTCTTTA 4140
ACACCTATAA ACAAACCACG TCCAOGGACT TCTTTAATTG ATGGATGATC AATTTGCTTT 4260

AATTGTTCTT TAAAATAATC TCCTAATTCT AAAGAGCGGC CTGGTAAATC CTCATCAACG 4320 

ATAACATCTA ATGCAGCAAT TGATGCAGCA CAAGCAAGTG GATTACCACC AAATGTTGAA 4380 

CCATGTGAGC CAGGTGTAAA GACATCTAAT ACTTCTTTAT CTGCTAATAC AACAGAAATT 4440

GGGAAGACTC CACCACCTAG TGCTTTACCT AAAATATAGA CATCAGGTTT TACATTATCC 4500

CAATCCGTAG CAAATAATTT ACCCGAACGA CCTAATCCTG CTTGGATTTC GTCAGCAATA 4560 

AATAAGACAT TATGTTCATC ACATAATTCT CTAATTGCTT TCAAATATCC TTCTGGCGGT 4620 

ATATTTATAC CCGCTTCACC TTGAATTGGT TCTACTAAAA CTGCTGCAGT ATTTTCATTA 4680 

ATTGCAGCTT TCAATGCATC TACATCTCCA AAATCAACTT TTCTAAATCC ATCTAATAAC 4740 

GGACCATAAC CACGTTGGTA TTCTGCTTCT GAAGATAATG AAACTGGCGC CATTGTTCGA 4800 

CCATGGAAGT TACCATTAAA TGCAATGATT TCTGCTTTAT TTGGCTCAAT TCCTTTAACA 4860 

TCGTATGCCC AGCGTCGTGC TGCTTTCAAA GCTGTTTCTA CTGCTTCAGC ACCTGTATTC 4920 

ATTGGTAAAG CTTTATCTTT ACCTGCCAGT TTACAAATTT TTTCGTACCA TTCACCTAAG 4980 

TTATCACTAT GAAAAGCACG TGAAACTAAA GTCACTTTAT CAGCTTGATC TTTTAATGCT 5040

TGAATAATTT TCGGATGTCT ATGACCTTGG TTAACAGCGG AATATGCAGA TAACATATCC 5100 

ATATATTTAT TGCCTTCAGG ATCTTTAACC CATACCCCTT CAGCTTcTGa AATGaCAATT 5160 

GGCAATGGTA AATAATTATG TGCTCCGTAA TGATTTGTTA ACTCAATAAT TTTTTCAGAT 5220 

TTAGTCATCA TATCTCCCCT TTTCATCATT TATAACTATT ATACATGAAA CATTATCCAA 5280 

ATAATTACAT TAGTTTTCAA AGCAGATACT TTTCCACCAA AAAAGATGAA ATAATCACTA 5340 

AGTTTCATTA AATTTGTCTA TTTTGAAAAC CCTTACATTT ATAATGACAT AATTACTTAA 5400 

ATGatTTACAA GCAAAAGAAT TGATAATTTT ACACTTAATC AAAAGTATAT TTTACTAAGA 5460 

ATATTTTTAT TTATAAATAT TGAAAACCAC TAACAAATTG CATACACAAT ATCATTAGTG 5520 

GTAACAGTTA AACACTTATT TATCTTTACG GGGTAATGGG TTAAAACCCT TnCATTAAAA 5580 

TTGGATGnCC ATAAAATTAG GG 5602 
(2) INFORMATION FOR SEQ ID NO: 130:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5924 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 
TAACCCCATT TTACCTGGAA AAATCgTTTG CGATGCaATm GCaTTtGaAT ATAaATACAT 60

TTTACGTATa GAATTATAAA AgGTTTCATT CaAATCTTAG GGTCAAAAAT GTTATAATAT 120 

TTTTATGTCA AATTTAAAAC AGTAACACTT ATTTACAAGG TTGCAATATT TTGAAGTAAT 180

AAAGGAAGTG TCGCGTATTT TAACTTTTTC AGAGCAAAAT GCACTCGCGA AAATAGATGA 240 

TTTAATGAAT ACTTATTGCA ATCAATGTCC AATCAAAACT CGTCTGCGTA AATTAGAGGG 300 

GAAAACGAAG GCGCATCATT TTTGTATCAA TGAGTGTTCA ATAGGGAAAG AAATAAAACA 360 

ATTAGGAAAT GAACTTCAAT AGGAGGAAGT CAAATGAAAA TTATATCTAT ATCAGAAACA 420 

CCGAACCACA ACACAATQAA GATTACACTT AGTGAAAGCA GAGAAGGTAT GACATCAGAT 480 

AOGTATACTA AAGTTGATGA TTCACAGCCA GCATTTATTA ATGACATCTT AAAGGTTGAA 540 

GGCGTTAAAT CAATTTTCCA TGTTATGGAC TTTATTTCAG TAGATAAAGA AAATGACGCA 600 

AATTGGGAAA CAGTATTGCC AAAAGTAGA6 GCTGTATTCG AATAAATTTT TCATCAACTA 660

GTATTCGGGG GQAATAAAGT ATATGGAAAT TTTACGTATA GAGCCAACAC CAAGTCCAAA 720 

TACAATGAAA GTTGTTTTGT CATATACAAG AGAAGACAAG TTATCTAATA CTTATAAAAA 780 

AGTAGAAGAA ACACAACCAA GATTTATAAA TCAGTTGTTA TCTATAGATG GTATCACTTC 840 

CATTTTTCAT GTCATGAACT TCTTAGCTGT TGATAAGGCA CCAAAAGCTG ATTGGGAAGT 900 

CATATTACCT GATATTAAAG CTGCTTTTTC TGATGCGAAT AAGGTTTTAG AATCTGTAAA 960 

TGAACCTCAA ATTGACAATC ATTTTGGTGA AATTAAAGCT GAATTATTAA CTTTTAAGGG 1020

TATACCGTAT CAAATTAAGC TAACTTCTGC TGACCAAGAA TTAAGAGAAC AATTACCACA 1080 

AACATATGTT GACCATATGA CTCAAGCGCA AACAGCACAT GACAATATTG TTTTTAT6CG 1140 

TAAATGGCTA GATTTAGGAA ATCGCTATGG AAATATTCAA GAAGTAATGG ATGGTGTCCT 1200 

AGAASaAGTG CTAGCTACCT ATCCAGAATC ACAGTTACCC GTATTGGTAA AACATGCTTT 1260 

AGAAGAAAAT CACGCAACTA ATAATTATCA TTTCTATCGA CATGTCTCTT TGGATGAATA 1320

TCATGCAACT GATAATTGGA AGACTCGATT ACGAATGTTA AACCATTTTC CAAAGCCGAC 1380 

TTTTGAAGAT ATACCGCTGC TTGATTTAGC TTTATCTGAT GAAAAAGTAC CGGTTAGACG 1440 

TCAAGCGATT GTATTATTAG GTATGATTGA AAGTAAAGAA ATTTTACCGT ATTTATATAA 1500 

GGGGCTTCGT GATAAAAGTC CTGCTGTAAG AAGAACAGCA GGGGATTGCA TAAGCGATTT 1560 

AGGGTATCCA GAGGCACTAC CAGAAATGGT GCTACTATTA GATGATCCAC AGAAAATCGT 1620 

TAGGTGGCGT GCTGCTATGT TTATCTTTGA TGAAGGTAAT GCAGAGCAGC TTCCCX5CACT 1680 

AAAAGCCCAT ATTAATGACA ATGCGTTTGA AGTTAAATTA CAAATTGAAA TGGCCATATC 1740 
AATTTAATTG GAGC3AATTAA ATAT6AATGC ATATGATGCT TATATGAAAG AAATTGCGCA 1860

ACAAATGCGT GGCGAATTAA CTCAAAATGG TTTTACAAGT TTAGAAACGA GCGAACAGct 1920 

ATCGGAGTAT ATGAACCAAG TAAATGCTGA TGACACTACT TTTGTAGTTA TTAACTCTAC 1980 

ATGCGGCTGT GCAGCTGGAT TAGCAAGACC AGCTGCAGTA GCAGTTGCAA CACAAAATGA 2040 

ACATAGACCT ACAAATACAG TTACAGTTTT TGCTGGGCAA GATAAAGAAG CAACTGCTAC 2100 

AATGCGAGAA TTCATTCAGC AA6CACCATC TAGTCCTTCG TATGCTTTAT TCAAAGGTCA 2160 

AGATTTAGTT TATTTTATGC CTAGAGAATT TATCGAAGGT AGAGATATTA ATGACATTGC 2220 

AATGGACTTA AAGGATGCCT TTGACXyUUUV TTGTAAATAG TACACATAAA TAAATATAAA 2280 

GGTTAACACA TTTTATAATA TTAAAAATGG TGTCTGTCAT TGAAAATAGA GAATATAGTT 2340 

GTATTCTATT TGTTAAATAA AGTCCGTTTT TACCaACTAT ATTTTCTAGA AATTTAACTG 2400 

TTTTAATAGG ACATCAAACA TAATATTCaA ATCaTGTGTT AACCTCTTTTT TTAAAATTTT 2460

TTAGCATTAA AGTTATAGAT TTGGGTAAAC AATTACCAAT TGGAAACATA TATCACGTTA 2520 

CGATGGGGTA GGTACTTAAT CAGCATTTTA TAAATAAAGT AACGGAATTC ATGATATTAA 2580 

TATCATATTC CTAAAATGAG TGATAACAAA ATGCTACATA AAGTTAAGTT ATATCAAACT 2640 

AAATATACAT ACTATAAATA ATGAAAATGA GGTGTTATCG CATATGTTGA ATTCATTTGA 2700 

TGCAGCATAT CACAGTCTTT GTGAAGAAGT TTTAGAAATA GGAAATACAC GAAATGATCG 2760 

CACAAATACA GGTACGATTT CGAAATTTGG TCATCAACTT CGCTTTGACT TATCTAAAGG 2820 

ATTTCCACTA TTAACGACAA AGAAAGTTTC TTTTAAATTA GTAGCAACCG AATTATTATG 2880 

GTTCATTAAA GGAGATACAA ACATCCAATA CTTATTAAAA TATAATAATA ATATATOGAA 2940 

CQAATGGGCT TTTGAAAATT ATATCAAATC AGACGAGTAT AAAGGTCCAG ATATGACAGA 3000 

TTTC25GGCAT CGTGCATTGA GTGATCCTGA ATTTAACGAA CAATATAAAG AACAAATGAA 3060 

ACAATTTAAG CAACGTATTC TTGAAGATGA TACATTTGCG AAGCAATTCG GGGATTTAGG 3120

AAATGTTTAT GGTAAACAAT GGCGAGATTG GGTTGATAAA GATGGTAATC ATTTTGATCA 3180 

ACTTAAAACA GTAATTGAAC AAATTAAGCA TAATCCAGAT TCAAGGCGAC ACATCGTATC 3240

TGCATGGAAT CCAACAGAAA TTGATACAAT GGCACTTCCG CCTTGTCATA CCATGTTCCA 3300

GTTTTATGTC CAAGATGGTA AGTTAAGTTG CCAGTTATAC CAACGTAGCG CAGATATCTT 3360 

TTTAGGTGTG CCATTTAATA TCcGCagctA CGCTTTATTG ACACACCTTA TTGCCAAAGA 3420 

ATGTGGACTT GAAGTGGGTG AATTTGTGCA TACATTTGGA GATGCACATA TTTATTCAAA 3480 

TCATATTGAT GCGATTCAAA CACAATTAGC ACGTGAAAGC TTCAATCCTC CAACATTAAA 3540 
TGAATCACAT CCAGCAATAA AAGCTCCAAT AGCAGTGTAG TCATTGCATA GTTAGCTAAC 3660
CATATAGACA TCAAAATGAC ATCATAGTAT TTTCAAGTGC AAAAAAGTAC TTTTTTGTGT 3720
TAAACGTTTT CATAAATTAT GCAAAATCAT TATTTCTATC ACACTTTATG ATAAAAATTG 3780
TGTTAAATTA AAGATAACTT AGTAATAAAA AATGAAATGA TAGAAGAAGG AGGATAATTA 3840
TGACTTTATC CATTCTAGTt GCACATGACT TGCAACGAGT AATTGGTTTt GAAAATCAAT 3900
TACCTTGGCA CCTACCAAAT GATTTGAAGC ATGTTAAAAA ATTATCAACA GGTCATACTT 3960
TAGTAATGGG TCGTAAGACA TTTGAATCGA TTGGTAAACC ACTACCGAAT CGTCGAAATG 4020
TTGTACTTAC TTCAGATACA AGTTTCAACG TAGAnGGCGT TGATGTAATT CACTCTATTG 4080
AAGATATTTA CCAACTACCG GGCCATGTTT TCATATTTGG AGGGCAAACA TTATTTGAAG 4140
AAATGATTGA TAAAGTGGAC GACATGTATA TTACTGTTAT TGAAGGTAAA TTCCX5TGGTG 4200
ATACGTTCTT TCCACCTTAT mCATTkGAgr CTGGGAAGTT GCCTCTTCAG TTGAAGGTAA 4260
ACTAGATGAG AAAAATACAA TTCCACATAC CTTTCTACAT TTAATTCGTA AAAAATAAGG 4320
GGGAAAACGA CCATGACAAA ACAGATTATA GTAACAGACT CAACATCCGA TTTATCTAAA 4380
GAATACTTAG AAGCAAACAA CATTCATGTA ATTCCTTTAA GTTTAACTAT TGAAGGAGCT 4440
TCATACGTTG ACCAAGTAGA TATTACATCA GAAGAATTTA TTAATCATAT TGAAAATGAT 4500
GAAGATGTAA AGACAAGTCA GCCAGCCATA GGTGAATTTA TATCTGCTTA TGAAGAACTA 4560
GGAAAAGATG GCTCTGAAAT CATAAGTATT CATCTTTCTT CAGGATTAAG TGGTACATAT 4620
AACACTGCTT ACCAAGCAAG TCAAATGGTA GATGCTAATG TAACTGTTAT TGATTCAAAA 4680
TCTATTTCTT TTGGTTTAGG GTATCAAATA CAACACCTAG TAGAGCTTGT AAAAgAaGGT 4740
GtCTCAACTT CTGAAATAGT TAAAAAGTTA AATCATTTAA GAGAAAACAT TAAATTATTT 4800
GTAG3TATAG GGCAATTGAA TCAATTAATT AAAGGTGGCA GAATTAGTAA AACAAAAGGT 4860
TTGATTGGTA ATCTTATGAA AATTAAACCA ATTGGTACAC TAGATGATGG TCGCTTAGAG 4920
CTTGTGCmCA ATGCGAGAAC TCaAAATTCk AGTATCCAAT ACTTGAAAAA GGAAATTGCT 4980
GAATTTATAG GAGATCATGA AATCAAATCC ATTGGTGTCG CACATGCTAA CGTCATTGAA 5040
tatgttgata aattgaagaa agtttttaat GAAGCTTTTC ATGTGAATAA TTACGATATA 5100
aatgtaacta caccagttat ttctgcacat ACTGGTCAAG GTGCGATTGG CCTCGTAGTC 5160
CTTAAGAAGT AAATTTAATC TTTTCAGTGT TAATTACTTC CATTTCAATC CTTTATAGAC 5220
TAAATTTATA ATTAGATAGA TAGAGGAGGT AATTCATATG ACAAAAGAAT ATGCAACATT 5280
AGCAGGAGGA TGTTTCTGGT GCATGGTTAA ACCATTTACA TCATATCCAG GCATCAAGTC 5340
GAATCAAACC GGCCATGTCG AAGCAGTACA AATTACGTTT GATCCAGAGG TTACTTCCTT 5460

TGAAAATATA TTAGACATAT ATTTCAAAAC ATTTGACCCA ACTGATGATC AAGGGCAATT 5520 

TTTCGATAGA GGCGAAAGCT ATCAACCAGT CATTTTCTAT CATGATGAAC ATCAGAAAAA 5580 

GGCTGCTGAG TTTAAAAAGC AACAATTAAA TGAACAAGGT ATTTTCAAGA AACCAGTGAT 5640 

TACACCTATT AAACCATATA AAAATTTCTA TCCAGCTGAA GACTACCATC AAGATTATTA 5700 

CAAAAAGAAC CCGGTACATT ATTACCAATA TCAACGTGGT TCAGGTAGAA AAGCGTTTAT 5760 

AGAATCACAT TGGGGGAATC AAAATGCTTA AAAAAGATAA AAGTGAACTA ACAGATATAG 5820 

AATATATTGT TACACAAGAn AACGGCACTG AACCACCATT TATQAATGAA TATTGGAATC 5880 

ATTTTGCTAA AGGATTTATG TAGATAAAnT TCnGGTAAAC CTTG 5924 
(2) INFORMATION FOR SEQ ID NO: 131:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9280 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 



GGCCGTTnAA AATCTCCAAA ATAnAAAAAC CCATCTTGTT CCAATGTTTT AAAATCGCCa 60
TCCaACACTT GaTCaATAGC TTGCAACAAC GTTGAACGTG TTTTaCCAAA AGCATCaAAC 120
GCTCCCACTA AAATCAGTGC TTCAAGTAAC TTTCTCGTTT TGACTCTCTT OGGTATACGT 180
CTAGCAAAAT CAAAGAAATC TTTAAATTTG COGTTCTGAT AAC6TTCATC AACAATCACT 240
TTCACACTTT GATAACCAAC ACCTTTAATT GTACCAATTG ATAAATAAAT GCCTTCTTGG 300
GAAGGTTTAT AAAACCAATG ACTTTCGTTA ATGTTOGGTG GCAATATAGT GATACCTTGT 360
TTTTTTGCTT CTTCTATCAT TTGAGCAGTT TTCTTCTCAC TTCCAATAAC ATTACTTAAA 420
ATATTTGCGT AAAAATAATT TGGATAATGG ACTTTTAAAA AGCTCATAAT GTATGCAATT 480
TTAGAATAGC TGACAGCATG TGCTCTAGGA AAACCATAAT CAGCAAATTT CAGAATCAAA 540
TCAAATATTT GCTTACTAAT GTCTTCGTGA TAACCATTTT GCTTTGCACC TTCTATAAAA 600
TGTTGACGCT CACTTTCAAG AACAGCTCTA TTTTTTTTAC TCATTGCTCT TCTTAAAATA 660
TCCGCTTCAC CATAACTGAA GTTTGCAAAT GTGCTCGCTA TTTGCATAAT TTGCTCTTGA 720
TAAATAATAA CACCGTAAGT ATTTTTTAAT ATAGGTTCTA AATGCGGATG TAAATATTGA 780
ACTTTGCTTG GATCATGTCT TCTTGTAATG TAAGTTGGAA TTTCTTCCAT TGGACCTGGT 840
ACACTTCTTA CACCGTCAGA CTCTAATTGG AATATGCCAG TCGTATCTCC TTGCGACAAC 960

AATTCAAACA CTTTTTGATC ATCAAACX3GA ATCTTTTCGA TATCAATATT AATACCTAAA 1020 

TCTTTTTTGA CTTGTGTTAA GATTTGATGA ATAATCGATA AGTTTCTCAA CCCTAGAAAA 1080

TCTATTTTTA ATAACCCAAT ACGTTCGGCT TCAGTCATTG TCCATTGCGT TAATAATCCT 1140 

GTATCCCCTT TCGTTAAAGG GGCATATTCA TATAATGGAT GGTCATTAAT AATAATTCCT 1200 

GCCGCATGTG TAGATGTATG TCTTGGTAAA CCTTCTAACT TTTTACAAAT ACTGAACCAG 1260 

CGTTCATGTC GATGGTTTCG ATGTACAAAC TCTTTAAAAT CGTCAATTTG ATATGCTTCA 1320 

TCAAGTGTAA TTCCTAATTT ATGTGGGATT AAACTTGAAA TTTCATTTAA TGTAACTTCA 1380 

TCAAACCCCA TAATTCTTCC AACATCTCTA GCAACTGCTC TTGCAAGCAG ATGACCQAAA 1440 

GTCACAATTC CAGATACATG TAGCTCGCCA TATTTTTCTT GGACGTACTG AATGACCCTT 1500 

TCTCQGCGTG TATCTTCAAA GTCAATATCA ATATCAGGCA TTGTTACACG TTCTGGGTTT 1560 

AAAAAACGTT CAAATAATAG ATTGAATTTA ATAGGATCAA TCGTTGTAAT TCCCAATAAA 1620 

TAACTGACCA GTGAGCCAGC TGAAGAACCA CGACCAGGAC CTACCATCAC ATCATTCGTT 1680 

TTCGCATAAT GGATTAAATC ACTTACTATT AAGAAATAAT CTTCAAAACC CATATTAGTA 1740

ATAACTTTAT ACTCATATTT CAATCGCTCT AAATAGACGT CATAATTAAO TTCTAATTTT 1800 

TTCAATTGTG TAACTAAGAC ACGCCACAAA TATTTTTTAG CTGATTCATC ATTAGGTGTC 1860 

TCATATTGAG GAAGTAGAGA TTGATGATAT TTTAATTCTG CATCACACTT TTGAGCTATA 1920

ACATCAACCT GCGTTAAATA TTCTTGGTTA ATATCTAATT GATTAATTTC CTTTTCAGTT 1980 

AAAAAATGTG CACCAAAATC TTCTTGATCA TGAATTAAGT CTAATTTTGT ATTGTCTCTA 2040 

ATAGCTGCTA ATGCAGAAAT CGTATCGGCA TCTTGACGTG TTTGGTAACA AACATtTTGA 2100 

ATC«AACAT GTTTTCTACC TTGAATCGAA ATACTAAGGT GGTCCATATA TGTGTCATTA 2160 

TGGGTTTCAA ACACTTGTAC AATATCACGA TGTTGATCAC CGACTTTTTT AAAAATGATA 2220 

ATCATATTGT TAGAAAATCG TTTTAATAAT TCAAACQACA CATGTTCTAA TGCATTCATT 2280 

TTTATTTCCG ATGATAGTTG ATACAAATCT TTTAATCCAT CATTATTTTT AGCTAGAACA 2340 

ACTGTTTCGA CTGTATTTAA TCCATTTGTC ACATATATTG TCATACCAAA AATCGGTTTA 2400 

ATGTTATTTG CTATACATGC ATCATAAAAT TTAGGAAAAC CATACAATAC ATTGGTGTCA 2460 

GTTATGGCAA GTGCATCAAC ATTTTCAGAC ACAGCAAGTC TTACgGCATC TTCTATTTTT 2520 

AAGCTTGAAT TTAACAAATC ATAAGCCGTA TGAATATTTA AATATGCCAG CATGATTGAA 2580

TGGCCCCTTT CTATTAGTTA AGTTTTGTGC GTAAAGCTGT AGCAAGTTGC TCAAATTCAT 2640 
CAATATCATT AATAATCAAT TGCCCTTTAG AACGTAATCG ACATCTGATT TCATTACCTT 2760
CATCGACTGC AAATACCCAT ATTTTCAAGC CTTTGATGTC AGCAATTGTA TTAACAAACT 2820
GAGATGCTTC ATTTGGCTGA ATACCGAATT GCTCCAATAC ATCTTCAGTT ATTTTAACTT 2880
GGCAGAATCC ATCATCCATA AGTTCGAAAT GTTGTAAAAC ATAACCTTGA AACGGCAACA 2940
TTTTTGGGTC CTTCTCCATC ATTTTATTTA AAAGCGCATT ATGATCAATA TCATGCCCAA 3000
TTAACTTTCC AGCAATTTCC ATAGTATGTT CTGAGGTATT GTTAAAAAGG AATCGCCCAG 3060
TATCACCGAC GATACCAAGA TATAAAACGC TCGCGATATC TTTATTAACA ATTGCTTCAT 3120
CATTAAAATG TGAGATTAAA TCGTAAATGA TTTCACTTGT AGATGACGCG TTCGTATTAA 3180
CTAAATTAAT ATCACCATAC TGATCAACTG CAGGATGATG ATCTATTTTA ATAAGTTTAC 3240
GACCTGTACT ATAACGTTCA TCGTCAATTC GTGGAGCATT GGCAGTATCA CATACAATTA 3300
CAAGCGCATC TTGATATGTT TTATCATCAA TGTTATCTAA CTCTCCAATA AAACTTAATG 3360
ATQATTCCGC TTCACCCACT GCAAATACTT GCTTTT6CGG AAATTTCTGC TGAATATAGT 3420
ATTTTAAACC AA6TTGTGAA CCATATGCAT CAGGATCTGG TCTAACATGT CTGTGTATAA 3480
TAATTGTATC GTTGTCTTCG ATACATTTCA TAATTTCATT CAAAGTACTA ATCATTTTCA 3540
TACTCCCTTT TTTAGAAAAG TTGCTTAATT TAAGCATTAG TCTATATCAA AATATCTAAA 3600
TTATAAAAAT TGTTACTACC ATATTAAACT ATTTGCCCGT TTTAATTATT TAGATATATA 3660
TATTTTCATA CTATTTAGTT CAGGGGCCCC AACACAGAGA AATTGGACCC CTAATTTCTA 3720
CAAACAATGC aAGTTGGGGT GGGGCCCCAA CGTTTGTGCG AAATCTATCT TATGCCTATT 3780
TTCTCTGCTA AGTTCCTATA CTTCGTCAAA CATTTGGCAT ATCACGAGAG CGCTCGCTAC 3840
TTTGTCGTTT TGACTATGCA TGTTCACTTC TATTTTGGCG AAGTTTCTTC CGACGTCTAG 3900
TATGCCAAAG OGCACTGTTA TATGTGATTC AATAGGTACT GTTTTAATAT ACACGATATT 3960
TAAGTTCTCT ATCATGACAT TACCTTTlTr AAATTTACGC ATTTCATATT GTATTGTTTC 4020
TTCTATAATA CTTACAAATG CCGCTTTACT TACTGTTCCG TAATGATTGA TTAAAAGTGG 4080
TGAAACTTCT ACTGTAATTC CATCTTGATT CATTGTTATA TATTTGGCGA TTTGATCGTT 4140
AATTGTTTCA CCCATCTGAG GCTGTCTTCC TAAAAGTTGC ATAGACTTTA AAACATCTTG 4200
TCTATTAATC ACACCCACTG TCTTTTTATT ACTCGAAACG ACAGGAATCA ATTCAATACC 4260
TTCCCAAATC ATCATATGCG CACAACTTGC TACTGTACTC ATAGCATTTA CATAAATAGG 4320
ATTTCGCGTC ATCACTTTAT CTATTTCGTC GTCGTCCTTT GTATTAATCA TCTCTCGACT 4380
TGTTACAATA CCTACTAATT TATACGACTC ATTGACTACC GGAAATCTTG TATGGCCAGT 4440
ATCTAATGGC GTCATTATAT CTTGAACTAT TAAGATATCT TTTCGTATTT TCTGATTAAA 4560

AAGTGCTTTG TTGATAATAT TTGCAACTAG GAATGTATCA TAACTTGATG ATAGAACAGG 4620 

TAAATCATGT TCATTCGCAA AATTAATAAC TTTATTAGAT GGCTTAAATC CACCAGTAAT 4680

TAATATAGCC GTACCTCTTT TTAAAGCTTC AATCTGCACA TCTTCACGAT TTCCGACAAT 4740 

CAATAATGTC TTTGGACCAA TATACTTTAA AATATCTTTG AGTTCCATTG CTCCAATTGC 4800 

AAATTTAGAT ACCATCTTAG TGATACCTTT GTTGCCACCT AACACTTGGC CATCAATAAT 4860

ATTGACAATT TCATTAAAAG TTAAATGTTC AATTTCATTA CGATTACGTT TTTCGATTCG 4920 

AACCGTACCA ACACGATCTA TCGTTGCGAC CATGCCCATT TTATCAGCAT CTTTmATTGc 4980 

ACGATATGCT GTCCCytCaG ATACGTTTAA AAATTTAGCG ATTTTACGCA CCGAAATTTT 5040 

AGAGCCTATA GATAACGATT CAATATAATC TAAAATTTGT TCATGTTTTG TCATTCTTTA 5100 

CCTCTTCTTT TCGAACAGTA TTAACTACAT TATAACTTTA TTTTGGATAA AAAGCATTGA 5160 

AGTGAAATGA AATAATGATC GTTtCACCTA TTTTATTTTT TGAAAATATA CAACAAACAC 5220 

AAAGATCACA AAATCTTTAA TTTTAAATGG AAAAATCCAT TATTATTTAT TAGAATCTAA 5280 

GTGAGGAGGG ATGTACTAAT GTATAAAAAT ATATTACTTG GTGTAGACAC TCAGTTAAAA 5340

AATGAAAAAG CACTAAAAGA AGTGTCTAAA TTAGCTGGCG AAGGTACAGT CGTAACAGTT 5400 

TTAAACGCAA TCAGCGAACA AGaTGCTCAA GCATCAATTA AAGCAGGTGT TCATTTAAAC 5460 

AAACTTACTG AAGAACGAAG CAAGCGATTG GAAAAAACAC GCAAAGCTTT AGAAGATTAT 5520

GGTATTGATT ATGACCAAAT AATTGTTCGT GGTAATGCAA AAGAAGAACT ATTAAAACAT 5580 

GCTAATAGCG GTAAATATGA AATTGTTGTT TTAAGTAACC GTAAAGCAGA AGACAAAAAG 5640 

AAATTTGTAC TTGGAAGTGT CAGCCACAAA GTAGCAAAAC GTGCGACTAT CCCTGTATTA 5700

ATCQTTAAAT AAAATTTTTA TCCAGAATCA CAAATAATCT TTCAATCATG ATGCAGTCTC 5760 

AAACGACTGA GTAAATACAA GAAACGATTA TGACTGTGGT TCTGGATTTT TTATATCGTA 5820 

GTAAATTTAT AATCAATGTC TAATTGTATA AAACTAAAAT TACX3AGAGTA GGTCAGAAAT 5880 

GATAAAGAAC CACTGATGTC CCCCGTCCAC GTCGTAACTG AATCAGTAQA ATATAAAAAC 5940 

ACCCACTAAA AATATGCAGA CGATAACTTC CACATAGATT AGCGAGGTGT TTTTTAGTGT 6000 

AAAATCTATA TTCTATTTAA AACTGAACAG ATTCACCTGG TTTTAAAATT TGCACGTCCC 6060 

CTACATTAAC AGCATCTTTA AATTGTTGT6 GATCTTGTTC GATTAATGGG AATGTATCAT 6120 

AATGAATCGG TACAGAAATT TTTGGTTTAA TAAATTCATT AATAGCATAA CTTGCATCAT 6180

CAATACCCAT CGTAAAAITA TCTCCAATTG GTACAAAACA TACATCAACT GGATGACGTT 6240 

TTCAACTTCA AACACGATAC CCATTGGCAT ACCTAAATAA ACTGGgAATA CCATTTTCAT 6360

GTGTAAAACT TGAACTATGA AATGCTTGAA CAAATTTAAC GCTTCCGAAA TCAAaGTTTG 6420 

CTTTACCACC AaTATTCATA CCATGAACAT TTTCAACACC GTGATATGAA GAAAGATAGT 6480

CAGCCATTTC TGCACTTCCA ATTACTGTTG CTCCTGTTTT CTTTGCTAGT TCCACAACAT 6540

CACCAAAATG ATCAAAATGA CCGTGCX3TTA AAACGATATA GTCTACCTGC ACTGTTTCAA 6600 

TATTCAAATC ACACTTAGGG TTATTTGAAA TAAACGGATC TACGATAACC TTTTTGTTGT 6660 

TCCCTTCTAA ATAAATCGTT GATTGACCAT GAAATGATAA CTTCATTTGA GCATCCTCCT 6720 

ATCAATTACT ATATAAATTT AGTACCCTTT TGCCACTTAA TTATAACAAA TTCTCAAATT 6780 

TTAAAAATTG AAAATCTAGT TAATGTATTA GCTCGATTTT GAAATCTAAT AATAATTGGC 6840 

ATAAAATGGA AGTAATATTA TGTTGAGGAG TGTTTATAAA ATGACAAAAA TATCAAAAAT 6900 

AATAGACGAA TTQAACAATC AACAAGCTGA TGCAGCATGG ATTACAACAC CGTTGAATGT 6960 

ATATTATTTT ACTGGATACC GTAGCGAACC CCATGAAAGA TTATTTGCAT TATTGATTAA 7020 

GAAAGATGGT AAACAA6TAC TATTTTGTCC AAAAATGGAA GTCGAAGAAG TCAAAGCATC 7080 

ACCTTTCACA GGTGAAATCG TTGGATATTT AGACACTGAA AACCCTTTTT CACTTTATCC 7140

TCAAACAATC AATAAATTAC TAATTGAAAG C6AGCACTTA ACAGTAGCAC GCCAAAAACA 7200 

ATTAATCTCT GGTTTCAATG TCAATTCATT CGGAGATGTT GATTTAACAA TCAAACAATT 7260 

GAGAAATATT AAATCCGAAG ATGAAATTAG CAAAATACGT AAAGCTGCTG AGTTAGCAGA 7320

TAAGTGTATC GAAATAGGTG TTTCTTATTT AAAAGAAGGT GTGACTGAAT GTGAAGTAGT 7380 

CAACCATATT GAGCAAACTA TCAAACAATA TGGCGTCAAT GAAATGAGTT TTGATACGAT 7440

GGTTTTATTT GGAGATCATG CCGCATCACC TCATGGCACA CCAGGAGATC GCAGATTAAA 7500 

AAG«ATGAA TATGTACTAT TTGATTTAGG TGTAATTTAT GAGCATTATT GTAGCGATAT 7560 

GACACGTACT ATTAAATTTG GTGAACCTAG CAAAGAAGCA CAAGAAATTT ATAATATTGT 7620 

ATTAGAAGCA GAAACATCTG CAATCCAAGC AATTAAACCT GGAATACCAT TAAAAGATAT 7680 

CGATCATATC GCTAGAAATA TTATTTCAGA AAAAGGTTAT GGTGAATATT TCCCTCATCG 7740 

CTTAGGTCAT GGCCTAGGAT TACAAGAACA TGAATATCAA GATGTTTCAA GTACTAATTC 7800

TAATTTGTTA 6AAGCTGGCA TGGTTATTAC AATCGAACCA GGTATTTATG TACCTGGTGT 7860 

TGCAGGTGTA AGAATTGAAG ATGACATACT TGTCACTAAT GAAGGATATG AAGTATTAAC 7920 

ACATTACGAA AAATAAGGAG TGGGATAAAA AT6AAAAGCT TGTTACAAGC GCATTCTCAT 7980

TCAGTCAAAC ACTGCCAATA TAACATTGTA GCGCCTAAGA CATAAATTTT TATCCAAGTC 8040 
TGTAATGAAT CAAATCAATA TCATTCATGT TCGATGATTT CTTCX3CATTG TTTCTAGCTT 8160

TAATTTATCA TTATTTAATT TTAATAACCA AGGAGATGAT AACGTCATTC TTTAGTACGC 8220 

TGTAATCCAT TCCCTTTTCA TCAAATTCAA ATTATAATTG TAAT6CTTCT TCTACAGATT 8280

TATATTCCAT TTCAAATGCC TCTGCAACGC CTTTATTGGT TACGTGACCT TTGTAAGTAT 8340 

TTAAACCTAA TGATAATGGT TGATTTGATT TAAATGCTTC TCTATACCCT TTATTAGCTA 8400 

GCATGAGCGC ATAAGGTAGC GTAgCATTAT TTAAAGCTAA CGTCGAAGTA CGCGGTACTG 8460 

CACCTGGCAT ATTTGCAACT GCATAATGAA CCACACCATG CTTAATATAT GTAGGATCAT 8520 

CATGTGTCGT AATTTTATCA GTTGtTTCAA AAATACCGCC TTGATCAATA GCAATGTCAA 8580 

TAATAACTGA CCCATTTTTC ATTTGTTTAA TCATGTCTTC TGTTACAAGT CTTGGCGCTT 8640 

TAQCACCTGG AATTAAAACT GCACCTATTA CTAAATCACT TTGTTTAACA TACAACTCAA 8700 

TATTCAACGG ATTTGACATA ATTGTATGTA CACGTCCACC GAATAAATCA TCTAATTGTT 8760 

GTAAACGCTT TGGATTAACA TCTAAAATCG TAACATCTGC ACCTAGTCCT AGTGCAATTT 8820 

TAGCTGCATT TGTTCCTGCT TGACCACCAC CGATAATAGT TACTTTACCC TTAGGTACTC 8880 

CTGGGACACC ACCTAGTAGA ATTCCCATAC CACCATTAAG TTTTTGTAGG AACTCTGCGC 8940

CAACTTGAGC TGACATTCTT CCTGCTACCT CACTCATTGG TGATAACAAT GGTAAAGATC 9000 

GGTCTGGTAA CTGCACAGTC TCATATGCAA TACTAATTAC TTTTCTATCT ATCAAAGCTT 9060 

GTGTTAATTT TTCTTCATTT GCTAAATGAa gatAaGTGAA TAATACAAGC CCTTCTTTAA 9120

AATATGGATA TTCAGATTCA AGTGGTTCTT TAACTTTAAT AACCATATCC ACATCCCAAA 9180 

CTTTTGCTTG TTCAGCAACA ATCTCAGCAC CTGCTTCTTT GTAATCTACA TCTTCAAAGA 9240 

ATGATCCTGA ACCCGcATTT GTTTCCACTA AAACAGTATG 9280

(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4669 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 
CTGATTAATC TCTTGTTGTC GTGTATTTAC TAATTGAATC GTTGGTGTCT GAACACGTCC 60 
CAGGGATAGC TGTGCATCAT ACTTTGTTGT TAGTGCACGC GTTGCATTAA TCCCAACAAT 120
CCAATCTGCC TCACTTCTCG CTAACGCTGC ATAATACAAA TCGTTATATT GACGACCGTC 180 

ACGGATTGGC TTTTTGTTAC CAACTTTATC CAAAATCAAT CTTGCAACTA GTTCACCTTC
TCGTCCaGCA TCTGTTGCAA TAATAATATC TTTCACTTTA TTATCTAAAA TTAACGCTTT 
TACTGTTTTA AATTGTTTGC TTGTTTTACC AATAACAACA GTTTTCATAT ATTTAGGTAT 
AATTGGAAGG TCTTCTAATC GCCATTCCTT TAAATTTTTA TCGTATTGTT CAGGTGTCGC 
ATTTGTCACT AGATGACCTA ACGCCCACGT GACAATATAT TGGTTATTTT CAAAGTAACC 
ATTACGCTTC TGATTTATTT GTAAAGCATC AGCAATATCT CTTGCGACTG ATGGTTTTTC 
AGCTAATATT AAAGATTTCA TAAATTATCC TTTCTCATAC GTTCTTTTAT TTCGAACGTG 
CTTCATCTAT TCCACTAATC TTTGATTTAA ATTCAATGAT TGCAAATGAT GTCTTAAATG 
TATTGTAACA T6TTAATATC ACTATTAACT TTCATTTCAG TTGAAATACT ATATAATAAA 
AGTAACAAAA AGTACGGAGO TAATQACATG AGCATAGTTC AGTTATATGA TATTACACAA 
ATAAAATCGT TCATTGAACA TTCGAATTAT GAATCAGCAT CATACTTATA TAAACTTCCT 
CAACAGTACA ATGAAATAGA TGTATTAATA ACCGATGCGA TTGAATCACC TGGTGTATTT 
TCGATTAAAG AAAAC6ATTC AATCAAAGCA ATCATATTGT CTTTTGCATA CX3ATAAAAAT 
AAATTCAAAG TCATAGGCCC TTTCGTGGCT GACAATTATG TATTATCTGT CGATACGTTT 
GAAACGCTAT TTAAAGCAAT GACTTCGAAC CAACCTGACG ATGCCGTCTT TAACTTTTCT 
TTTGAAGAAG GCATTCAACA ATACAAACCA TTAATGAAAG TTATTCAAGC AAGTTATAAC 
TTCACTGACT ATTACATAGA AGCCCGTACA AGATTAGAAG AAGATATGCA CCAACCAAAT 
ATCATTCCTT ATCACAAAGG GTTTTATCGT GCTTTCAGCA AATTACACAC AACTACATTT 
AAATATCAGG CACAGTCACC ACAAGATATC ATTGATAGTT TAGACGACCA TCATCATTTG 
TTTTTATTTG TTAGCGAAGG TTTACTTAAA GGTTATTTAT ACCTTGAAAT TGATTCACAA 
CAGTCAATCG CCGAGATTAA ATACTTCAGT TCTCATGTAG ATTACCGTTT GAAAGGTATC 
GCTTTCGAGT TGCTTGCGTA TGCATTGCAA TATGCTTTTG ATAATTTTGA TATTAGAAAA 
GTTTATTTTA AAATTCGTAA TAAAAATAAT AAACTCATCG AACGATTTAA TGGTCTAGGT 
TTCCATATCA ACTATGAGTA CATTAAATTC AAATTCGAAT CACGTAACGT AAAAGATCAA 
ACAATCCCTG AATAAAACAC CAAGCAAATA CCCTACAGTA CATCATTAGC ATGTATTGTG 
GGTTTTTCTA CTTTTTGTAA ATATTGAAAA TTATAAGTAG TTGTTTTTTA CTATTAGGGC 
AGAATGCTTT ACAATAACAT GCAAGTGTCA ATTAAGGGGA GCACTTGCAT AAATAGTATA 
GGAGAGTGAG TAGTCTTGCA ATTTCTTGAT TTCTTAATCG CACTTTTACC TGCTTTATTC 
TGGGGAAGTG TCGTTCTTAT TAATGTGTTC GTCQGCGGTG GACCTTACAA CCAAATTCGT 
TTCAATAATC CTACT6TAAT TATTGTCGGT CTTATTTCTG GTGCATTATG GGCGTTTGGA 2100
CAAGCGAATC AGCTTAAATC TATTAGTTTA ATCGGTGTAT CAAATACTAT GCCAGTTTCT 2160
ACAGGTATGC AATTAGTTGG TACAACATTA TTCAGCGTTA TCTTTTTAGG TGAATGGTCT 2220
TCAATGACTC AAATTATCTT TGGTTTAATC GCCATGATAT TATTAGTTAC TGGTGTAGCA 2280
CTTACTTCAC TTAAAGCTAA AAATGAACGT CAATCAGATA ATCCTGAATT TAAAAAAGCA 2340
ATGGGTATTT TAATTGTATC TACAGTTGGA TATGTAGGTT TCGTTGTACT TGGTGACATC 2400
TTTGGTGTTG GTGGAACTGA TGCATTGTTC TTCCAATCTG TCGGTATGGC AATTGGTGGC 2460
TTTATCCTAT CCATGAATCA TAAAACATCA CTTAAATCAA CAGCACTTAA TCTATTGcCA 2520
GGGGAATTGG TAACTTGTTC ATGTTCTATT CTCAACCAAA AGTTGGTGTA 2580
GCTACAAGTT TCTCATTATC ACAGTTACTT GTTATCGTTT CAACCTTAGG OGGTATTTTC 2640
ATTTTAGGAG AAAGAAAAGA TCGTCGTCAG ATGACGGGTA TTTGGGCAGG TATTATTATT 2700
ATOGTGATAG CTGCTATAAT TCTAGGTAAT TTGAAATAGA AAGTTAAATA CTCATGTAAC 2760
GTAAAAATGT AATCACTTCT GAAAATAACC ATTCACTTAT AGAATGATTA AAATTAATTT 2820
TCXJGGAATTT TACGTTGAAT GTTCCTCTAT ATGTCCTAGG AAATACGTGG CTCTAAAAAC 2880
AAAACGCAAT AACACATCAT GACATTAATC ATGCGTTTTA AGACTTTAAA ATTAGCGATA 2940
CTTTTAAAAT CTTGATGATA TTCATATATC AAGTATGCGC CATACATATG AAGTGGATAG 3000
CTGCATAACG CACTGCATTA TCAACTTGAA TGTATGAGTT GAACAACTAT GTCATAAATA 3060
AAAGCCCCCT TTTCACAATA TACATTTACA TATTGTGGTA AAGGGGGCTC TCATTTTCTA 3120
CGAATACTAA AATGGATTTT ATTTTCAAAT GTGTAAACTA GACAAACACT GCCTGATACA 3180
CX3TACAAAAT AATGATACTA ATAATGATTG TCAAATTGGT CX3TCATACCT ATAAATGGCA 3240
GTGTTCGATA TTTAAACT6A ATACCATAAG AAATAATTGC AACACCTACC GGGAACATCC 3300
AAGTGACCAA CAATGTCGTC TTAATCATAT CATCTGATAC TGGTAACAAG ACATATACTA 3360
ACAATCCCGC AACTAATGCT AATCCATAAT GCAAACATAA ATATTTAATA GTAGCAGGTA 3420
TATACTTTCT TTCCAGAGTA AAATTCAACA TGACACCTAG CAAAATCATT GATAACXKSCA 3480
TATTTGCATG GGAAAGTATG CTAAAGAAAT CGATTGCCAC ATGTGGTAAA TGGATGTGAC 3540
TTATATTCAA TATAAACATT ACAATGTAT6 TAACGAGTGG CACTGATTGT AATAATTTCT 3600
TACCTAAATA TTTAAAATCG AATTGATCAC TACCTTCACT AAAGTAGCTA CCTACAAAGT 3660
AAGTAATTCC AAACATCACA AAGGCACCAC CTATATCAGC CATAACAAAA TAAATAAGTC 3720
CCGTTTTAGG CCATATCACT TCAATTAGTG GATATGCAAA CAATCCAATA TTCATAGCAC 3780
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CAATCATTTT CGCCACAATA CCATATATAA TCATTAAAAT TGGTAAAATG GAGAATGACA 3900 

ATTTTAATTC TGCACTGTTT AAATTCACAA TAACTAAAGA TGGGAGTGTG ACATTAAGAA 3960 

CTAATGTAGC AATGACTTGA CTATCTGTTG CTTTTATAAA ATTAATGCGC TTCAAAAAGT 4020 

AACCAAGCGC AATTAATAAA ATAATCATAG TAAATTGTTC TGTCACTGTT ATCCCTTCTT 4080 

TCAATAATCT TCATAATTTA TAACTTTAAC ATACTCCACA GATATTTTAG AAGTCTACTG 4140

TTTCATGCTA TAATCTACAT TAAATGCACT TAATTATATT TCAAAGGAGT GTTATAGTAT 4200 

GTCTTTAGAA AACCAACTAG CCGAACTTAA ATATGATTAT GTTCGTCTTC AAGGTGACAT 4260 

AGAAAAACGG GAATCTTTGA ATTTAGATAC TTCCGCACTT GTTCGTCAAC TTAAAGATAT 4320 

TGAAAATGAA ATTAGAAACG TTCGTGCTCA AATGCAAGAT TAATAATCTA TCATTCAAGC 4380 

AATAAATGCT TTTTGTTACA TAAATTTGAC TAGCATTGCT CTGAATACGT TATATTGATG 4440 

AATTGCTTCA TTTTTCGCTC AATTACATCT AGAATCACAA GATGTT6TCG TGTTATGATT 4500 

TAGTGTTTCA TTAACAACAT ACACGCATAT CTATCCCAAC ACTGCTATTT ATGTTTTCTA 4560 

CGCTGnTGTA CTACATGAAC CCTTTGAAAC GGAGAGGAAG TTATCATATG CAATTTTAnC 4620 

TGATTTTACT AGCAATACTT TAACnAATTG nTAGTTTAAT AGAATTTTA 4669 
(2) INFORMATION FOR SEQ ID NO: 133: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2785 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 

TTTQCACCCA TCTGaTACAA TGCACCATGC GGTTTAACAT GATTAATTTT AACTTGATGA 60 

ATGCGACAAA ACCCTTGTAA TGCACCTAAT TGATAAATCA TCAAATTATA AATCTCGTCG 120 

TTAGAGATAT CTATATTTCG TCTGCCAAAG CCTTTCAAAT CAGGTAAACC AGGATGTGCA 180 

CCTACTGCAA CATTATGTGC TTTGGCAAGT TTTACCGTTT CATTCATTAC ATTTTCATCA 240 

CCAGCGTGAA AACCACAAGC AACATTCGCA CTTGTAATTA ACGGAATAAT TTGATGATCA 300 

CCACCAAAGG AATAATTTCC AAATGCTTCG CCTAAATCAC AATTCAAATC AACTCGCATT 360 

ATAATTCCAC CCCTTTAACA ATTTGATGTT TTTCTAAAAA TTTAATATCA ACATCTTTTG 420 

CATCTCCATC ACGATATAGT GGATAATTTA AAACTGCATA TAAAAAATCG GCAGTTGTAG 480

AAAATCCATC TATCACCATT TCATCTAAGG TGACTTTCAA CTTATCAATT GCTGAAGCTC 540 
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

AACCGTGATA TAGTAAAGAA TCGACTCGCA CATTAAAGCC TTGAGGTAAA TGTAACGCTG 660 

TCACTTTACC TGGTGTTGGT TGAAATTTCT TTTCaGGATT TTCGGCATTT ATTCTCGCTT 720 

CTATCACATG ACCATTAAAT TGAATATCGC TTTGTGAAAA AGGTAAATGA TTATGTTCCA 780 

ATAAATACAG TTGTGCTGCA ACCAAATCAC GTTCTGCTCG CATCTCTGTA ACAGTATGTT 840

CAACTTGTAT TCGAGCATTC ATTTCAATAA AGTAATGTGC GGTATCAGTT ACTAAAAATT 900 

CAATCGTACC TGCACTTCTA TAATTTGCTG CACGTGCAAC TTTAACAGCA TCGTTACATA 960 

TTTGTTGTCG TCTTTCTTCA GTTAATGCTG CACAAGGAGA TTCTTCGATT AATTTTTGAT 1020 

TTTTACGTTG TACAGAACAA TCACGTTCCC CTAAATGTAC ATAATTATCC TGCCCATCTC 1080 

CCaTAACTTG AACTTCAACA TGTTTTGcAA CAGGTATAAA AGCCTCAACA TAAACACGAT 1140 

CATCATCAAA GTATTTTTTT CCTTCACTTT TAGCTTCTTT AAATGCCTTT TCTAAATCTT 1200 

CAGCTTTCTT TACAATACGT ATACCTTTAC CACCACCGCC ACTGGCAGCT TTGATAACAA 1260 

CTGGATAACC GATGTCTTTG GCAAGATTCT CAATTTCAGA CACATGATTC ACAOCACCAT 1320 

TTGATCCTGG AATCACAGGA ACACCTGCAT GAT6AACTGT TTGTCTTGCT GTTATTTTAT 1380 

CCCCCATCAT TTCCATCGTT TTTTTAGTAG GCCCTATAAA CGCTATGCCT TGTTCCTCAA 1440

CGGTTTGAGC AAATTTTGTT GATTCTGATA AAAAGCCATA TCCTGGGTGA ATTCCATTAG 1500 

CACCAGTGAT TTGTGCAGCA GATATGATGC GGTCAATATT TAAATAACTA TCTAAAgCAT 1560 

TArcwTCCCC AATACATATA GCTTGATCTG CTAAATGTAC ATGCAAGCTT TGCTCGTCCC 1620 

CTTTTGCATA AACTGCTACA GTTTCAATCC CATATTCTCT GCAAGCTCTT ATAATCCTTA 1680 

CAGCAATTTC ACCTCTGTTC GCAATTAAAC AACGAAGCAT TTACTTACCC CCTTTACTTA 1740 

ATACGTACCA AAACTTGGTC GTATTCAACA TTTGTGCCAT GATCAGCTAC TATTTCAGTA 1800 

ATTiCTCCAG CAACATCTGT TGTTACCTCG TTTAATACTT TCATCGCTTC AACATATCCT 1860 

ATAATATCTC CCTTGTTAAC TTTGTCACCG ACATTCACAA TTGGTTCAGT TAATTCTTTA 1920 

CTATCTTGTA AAAAGAATGT ACCTATCATT GGTGATTTAA TGTCATGATA ATCATTTGTC 1980 

GAAACATCGG AGTTATCATT CGCTTTTGAA GCTGTCAAAT CATTATTGTT CATACTTTGA 2040 

TTTGATTGAT TACTGTGTGC AGCCAAATGA TTCGAGTCAG TGAAGTCAAT TTCTATTTCA 2100 

TCTTCAAAAT TTTTATATTT AAATTTCTTA ACATCATTTT CCTTCACTAA TTTGATTATT 2160 

TGTTCGATTT nTTCAATATT CATTTTACAA ATCCCCTTTT AAAATTGTTG CTAATTTTTT 2220 

CGAAGTATGT CGCAAGCTAG ATGTATCAAA AATTGGAGTC TTTTGATGAC TCTTAAGAAT 2280

TTCATTAAAC AGAGACATTT GTTCCCGATT CTTATCTACA GCTTCTTGGA ATGATATCCA 2340 
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TACAGTTGCA ATTTTGGTAT AACCACCTAT CGTTTGTTTA TCATTAAGCA GAATAATAGG 2460 

TTGACCATCA TTTGGTACCT GAACACTACC AAGAGCAACC GGTTCAGAAA TGATATCTGC 2520 

TTGATTAACT GGTGCAACGC TGTCACCTTC CAAACGATAG CCCATACGGT CTGATTGTTC 2580

AGTAATTAAA TATGGATGAT TTACAATTTT CGCTCTAGCC TCTTCAGAAA ATGCCTCGAA 2640 

TTGAGGTCCT TGAAGAATGT GTATAATATT ATTTTCTGGC AATAAATCGT CCTGTAAATG 2700 

AATCGTCTTT CCAATGTTTT CTTTAAAGTC ATTATTTATT TTCACTGTTA TTACATCATC 2760 

AGCTAATAAC TTTCTACCTT TGAAT 2785 

(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1010 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 

AATGGAAACG GTTGAAACAG CAATTATTAC TATTTCTATG GGTGAAGGTA TTTCAGAGAT 60

ATTTAAATCA ATGGGTGCCA CACATATCAT TAGTGGTGGA CAAACGATGA ATCCTTCTAC 120 

AGAAGATATC GTTAAAGTCA TTGAACAATC AAAATGTAAA CGTGCAATTA TTTTACCGAA 180 

TAATAAAAAT ATCTTAATGG CAAGTGAACA AGCAGCGAGT ATTGTTGATG CAGAAGCTGT 240

TGTTATTCCA ACGAAATCTA TTCCTCAAGG TATAAGCGCA CTATTCCAAT ATGATGTGGA 300 

CGCAACACTT GAAGaAAATA AAGCGCAAAT GGCTGATTCA GTAAATAACG TTAAATCTGG 360 

TTCATTAACG TACGCTGTTC GTGATACGAA AATTGATGGC GTTGAGATTA AAAAAGACGC 420 

GTTT;&TGGGC TTGATTGAAG ATAAGATTGT AAGCAGCCAA AGTGATCAAT TAACAACGGT 480 

TACTGAGTTG TTAAATGAGA TGTTAGCAGA AGATAGTGAA ATATTGACTG TGATTATTGG 540 

TCAAQATGCA GAGCAAGCAG TTACAGATAA CATGATAAAC TGGATCGAAG AGCAATATCC 600 

AGATGTAGAA GTGGAAGTTC ATGAAGGTGG ACAACCAATT TATCAATATT TCTTTTCAGT 660 

AGAATAAAAA TTTAAAATAA AAAACTACCA ATQATAAATC ATCAGTTGGT AGTTTTTTAT 720

TTTGCTATTT TAGTGATATT GCGGGTTAAA AGTATCGTTC TCGAGTTGCT AACAATGTCA 780 

TGTTCAACTT AGTCATGATA AAATAAATAA CATACTAAAT GATACGTAAA ATCAAATAAA 840 

ACATAGGTGA TTTATnTGG CTAAAGTAAA CTTAATAGAA AGTCCATATT CTCTTTTACA 900

ATTAAAAGGT ATAGGTCCTA AGAAAATAGA AGTATTGCAA CAACTAAATA TTCATACAGT 960 
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(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1540 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 

TGTAGTTGAA CATGAACAAC AAAAGAAAGA AAAGACAAAA AAGCAATACA AGCCATTTTG 60 

GATTGTCATG AGTTTTATAA TACTTATAGT TGTACTATTA CTCCCGGCAC CTTCAAGTCT 120 

GCCGATAAT6 GCTAAGGCAG TACTAGCTAT TTwAGCTTTT GCAGTTATTA TGTGGGTAAC 180 

GGAAGCTGTA TCATATCCGG TGTCAGCAAC TTTAATTATT GGCTTAATGA TATTACTTTT 240 

AGGATTTAGC CCTGTTCAAA ATTTAGGGGA GAAGCTAGGT AATCCGAAAA GTGGCAGTGC 300 

TATTTTAGCT GGAAGTGACC TTCTAGGAAC TAATCATGCA TTATCATTAG CGTTTAGTGG 360 

ATTTGCAACT TCAGCTGTAG CTCTCGTTGC AGCTGCATTA TTTTTGGCTG CTGCTATGCA 420 

AGAAACGAAT TTGCATAAAA GACTAGCTCT TTTAGTGTTA TCAATTGTTG GTAATAAAAC 480

TAGAAATATA GTTATTGGAG CAATTATCGT TTCAATTGTA CTTGCATTTT TCGTTCCTTC 540 

TGCAACAGCT AGAGCAGGGG CAGTTGTACC AATCTTGCTG GGTATQATTG CGGCATTTAA 600 

AGTTTCCAAA GATAGCAAGT TAGCGTCTTT ATTAATAATT ACTTCAGTAC AAGCTGTGTC 660 

AATTTGGAAT ATTGGTATCA AAACGGCGGC AGCACAAAAT ATCGTAGCGA TTAATTTTAT 720 

AAACCATCAA TTAGGATTTG ATGTTTCATG GGGCGAGTGG TTCTTATATG CAGCGCCTTG 780 

GTCCATAGTT ATGTCCGTAG CTTTATATTT CATCATGATT AAAGTGATGC CTCCAGAAAT 840 

TAATSCAATA GAAGGTGGTA AAGATTTAAT AAAAGAAGAA TTGCATAAAC TTGGCCCCGT 900 

TAGCCCACGT GAATGGCGTT TAATTGTTAT ATCGATGTTA TTATTACTGT TTTGGTCAAC 960 

TGAAAAAGTA TTACATCCGA TTGACTCTGC ATCCATTACT ATTATTGCTT TAGGTGTTAT 1020 

GTTAATGCCG AAAATTGGTG TCATGACATG GAAACATGTT GAAAATAAAA TACCATGGGG 1080 

AACAATTATC GTGTTTGGTG TAGGTATTTC ACTAGGTAAC GTTCTTTTGA AAACAGGTGC 1140 

AGCTCAATGG TTAAGTGATC AAACTTTTGG TGTTTTAGGT TTAAAACATT TACCTATTAT 1200 

CGCGACAATT GCACTTATCA CGCTTTTTAA TATATTGATT CATTTGGGCT TTGCGAGTGC 1260 

AACAAGTTTA TCATCAGOGT TAATACCT6T TTTTATTTCG CTAACCTCTA CGTTACACTT 1320

AGGAGACCAG TCTATAGGAT TTGTTTTAAT TCAACAATTT GTTATTAGTT TTGGTTTCTT 1380 
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AGATTTCTTG AAGGCAGGTA TACCATTGAC AATTGTAGGG aATAtCtAgT GaTAGTTTTT 1500 
AGCATGACTT ATTGGAAATG GGTAAGGTTG CnTTAATTAA 1540 
(2) INFORMATION FOR SEQ ID NO: 136: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11823 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 

ACTTCTCACA ATAAGAAATA TGAAATTGTT ATGTGTTAGT TGAGATTCAG TGATGAATTA 60 

CTTTTATCAT TTAAAATGTT GTTATCATTG TCATGCGTTA CCAAATCGCT TACGTATACA 120 

CGATTCCCAA TCTTAACATA GACGATTT6T ATATCAGAAT TTTCTGATTA CTAACAGTTT 180 

ACCTAAGTTT AAATATCTGT TCAATGATTT TCAGTTATTT TTAAAAGAAA AATCGTAATG 240 

CTGCCATGAT AACAATCCCA CTAATAATTG TAATAGTTAA AtACGCGTGA TTATAGATAA 300 

AATAACCGTC GGAATGAGCG CGATAATGTA AGGGATGTTT AATGTATACC CCTCACCATG 360 

AGGCGTCTGT TGAATAATGC TGTCAATGAC AAGTGCCGTA AATAGTGTGA TTGGGATAAA 420 

TGATAGCCAT CGAACCACGA CATCAGGCAA TTGCACTTTT GAAATCATGA TAAAAGGTAT 480 

AATTCGAATT AATAGCGTTA CGATACCACA CAATAAAATA AGTATTAACA TGTTCATATG 540 

AGTTATCATT GTTCCATCAT CACTCCTAAC GCTGCTGAAA TTGTGGCTGC AATTAATATT 600 

GCTAGATATG AAGGCATAAA CATACTTAGC GATAACATCA TTACTATGAC GGCAATAATG 660 

AGTACTATGT AAATTCTTAA TCGCGATTTA GTAATTGATT CAAATTGCGC AATGGCCAAA 720 

AAQMAAACA TAGCCGTGAT AGCAAAATCT AACCCTAGCG TTTGCGGATT TGAGATATAT 780 

TCGCCAAATA AAGCCCCAGC TACACATGAA ATTGCCCAAA ATAAATATGC TGTGATGTTA 840 

AGACCATGCA TCCAACGATC ATTGATAGCT TCTCCTTTTA AATAAGGTGT AATGGCGACG 900 

CCAAACGTTT CGTCAGTTAC TAATGAACCT AATCCAACAC GGTTCCAAAA CCCATATGTC 960 

TTGAAGTITG GTGCAAGCGA CATACTTAAA AGGAACATTC TTGAATTTAC GATAAATACA 1020 

GTTAGTACAA TCGCTGATAT AGGTGTACCT GCTATAAACA ACGCGCACAT AATAAATTGC 1080 

GCAgcaCCGG CATATATAAC AAGACATAAC AAGACAATTT CTAAAATACT AAAGTTTTGA 1140 

GACGAAGCCA CAATACCAAA TGAAATACCA ACACCGGCAT AACCCAATAA TGTTGGGATA 1200 

CACTCTTGCA CGCCTTGTCT AAAACTTAAA TGTGTTGTCA TCTCAATTAC CTCCTTTGCC 1260 
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TAAGCAATAA CATTAGACAT CAGTTTGTCT GAGGTTAGAC ATTCCGGAGT CTTTAGTCAG 1380 

CTTCATATTA ACTTTTTATT TTTGAGAATT TTCAATTTTT TATTTAAGAC TACCTCCATA 1440 

TTTTCTATGG aTTTGTAGTT GTTTTTAAGT ATCAATTTTA TAAATTTTTA TATCTGATGA 1500

TGAGTCTGGG aTATTGaTTC ATGTACCACT CCCTTaTaAT CATCCCCTCC CCCTaCCCTA 1560 

CTCCATCGAT ATAACTCATA CTACATATCA ACGAAATCAG TATTTTATCG CTTCCTTTCC 1620 

TATATTAGTG ATGCTCAAAC TTGTTACGTT TTAGATTGTT TTAGTTCATC ATAATTATCC 1680 

CGTATTGTTG CTATAATGAA ATGCGTTCAC CCCATTAAAC CACAAACTTA ATTTATTGTT 1740 

GTTATGTGCA TTGGCTCACT ATTATATTTT TACAGCACAA AAAAAGTGGC GACAGTTCGT 1800 

CACCACTTTT TAAAATATTA TTTAAAGTAT CTTGCCCTTG CTTTAAGTAT ACGTAGATAT 1860 

ATACTTTTTA AAGCTTGTAG CTAAAGCCTT TATTTAACTG GTTTTGAAAT TTGTGTTTTA 1920 

CCACCCATAA ATGGTACTAA TGCTTCTGGA ATT6TTACTG TTCCATCTTC ATTTTGGTAA 1980 

TTTTCAACAA TAGCAGCAAA TGTACGTCCA ACTGCTAAAC CACTACCATT TAATGTATGT 2040 

GCTAATTCTG GTTTAGCTGC TTTGTCACGC TTGAAGCGGA TGTTAGCACG ACGCGCTTGG 2100 

AAATCCGTAC AGTTTGAGCA TGAACTAATT TCTTTATAAT CATTGTAGCT TGGTAACCAA 2160

ACTTCTAAAT CATATGTTTT GCTTGCACTA AATCCAATAT CACCTGTACA TAAAATAACA 2220 

CGACGGTATG GTAAACCTAA CTCTTCTAGA ATTGCTTCTG CGTTTGTTGT CATTTCTTCT 2280 

AAA6CATTCC ATGAATCTTC AGGTTGTTCA AAACGTACCA TTTCCACTTT ATCGAATTGA 2340

TGTAAACGAA TTAATCCTCT TGTATCTCTA CCTGCTGATC CTGCTTCACT ACGGAAACAT 2400 

GCAGATTGAC CAGTGAATTT TTCAGGAAGT ACACCTGGTT GAATAATTTC ATTACGGTAG 2460 

AAATTCGTTA ATGGTACTTC AGCAGTTGGA ATTGTATATA ATCCTTCTTT TTCTACTTTA 2520 

AAT/tfVATCTT CTTCAAATTT AGGTAATTGA CCTGTACCAT ACATTGTATC TGCGTTCACA 2580 

AGCTGTGGTA CCATCATTTC TGTATAACCA TGTTGTGTTG TATGTTTTGT AATCATATAG 2640 

TTCATTAAAG CACGCTCTAA TTGCGCACCT TCATTTGTTA AATATACAAA ACGCGCACCT 2700 

GAAACTTTTG CTGCACGATC AAAATCAGCC ATTTTCAATT CTTCTACAAT ATCCCAATGT 2760 

GCTTTGGGTT CAAATGAAAA CTCaCGTGGT GTACCCCACT TTTTAACTTC AACGTTATCT 2820 

TCATCAGATT CACCTTGAG6 TACATCATCA CTTATTAAAT TTGGAATACG ACAAAGGATA 2880 

CCTGTCATTT TATTATCAAT TTCATTTAAT TGACTATCTT TTTCTTTAAT ATCGTCACCT 2940 

AATGTGCGCA TTTCAGCAAT CACATCATCA GCATTTTCTT TATTACGTTT TTTTAATGOG 3000

ATTTCTTCGC TTACTTTATT ACGACGTGCT TTCATTTCTT CTGTTGCACT AATTAATTTA 3060 
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TCAATTTTGC TCTTAACTGT GTCAGGCTCA TTTCTGAATA ATCTAATGTC TAACATTAAC 3180 

CTTCATCCTT TCCCAAATAA TTATCATTTA TTATGGAATG ACGTACGTCT TTATTTTTTA 3240 

GAAAATAAAA AAAGACCACA TCCCTACAAG GGACGTGGTC TACGCGTTGC CACCCTATTT 3300 

AACAATTTAA GTTATAAAGA TACACTAAAC CTAAATTGCA CTTCACTAAA ATAACGGTTA 3360 

TCACCGATTG TTCTTTTAAA TTAAGTAGGT AGATTCATAT ATATGTTGAT TCTTGTTCAC 3420 

ACTAACCACA AGCTCTCTGA TATCGAACAC TATATATTAC TTGTCCTACG AACAATGTCT 3480 

TATTAAGTTA TTTTTAATAT AGCAAACTAT ATTTGCTTTT TCAAGTAACG ATTTCAAACA 3540 

TCACTCATGT CGATTTAGTG ACATGCAGTC GTTTGATAAA TTGATTGCTT TAAATACTGT 3600 

GCAACCGCTT CAATATCTTT ATGAAATTGA OGATCATGTG TAATGGATGG CACGATACTT 3660 

CGAAACTCAT CATACTTGCG ACGTGTTTTT GGTGATAATC CTTCAACACC TTTTAACTCT 3720 

GCTGCTTGTA AT6CAATAAC ACATTCGATT GCCAGCACAC GTCTTGCATT TTCAATAATT 3780 

TGATAACCAT GTCTAGCAGC TGTAGTTCCC ATAGATACGT GATCTTCTTG GTTCGCAGAT 3840 

GAAGTGATAG AATCAACACT CGCTGGATGC GCTAAAGTTT TATTTTCAGA AACGAGACTT 3900 

GCAGCAGCAT ATTGCTITAAT CATCGCGCCA CTTTGCAATC CTGGCTCTGG ACTAAGAAAT 3960

GCTGGTAAAT CACCATTTAA TTGAGGATTT ACTAGTCGCT CTAGACGACG TTCC6ATACG 4020 

TTTGCTAATT CACTTACACC TAATTTAAOA TGATCTAATG CAAAAGCAAT AGGTTGTCCA 4080 

TGGAAGTTAC CACCTGAAAT AACAAACGTT TCATTTGCTT CCTCAAATAT AAGTGGATTA 4140 

TCATTAGCCG CATTCATTTC AAATTCTAAT TGCTGTTTAA CATAATTGAA TACTTGAAAA 4200 

CTCGCGCCAT GGATTTGTGG TATACAACGC AACGTATATG CATCTTGTAC ACGTATTTCT 4260 

GATTGTCGCG TCGTTAATGT TGATCCTTCT AACCAATCAC GCATACGCGC TGCCACATTA 4320 

ATCrerTCTT GAAAATTACG AACTGCGTGC ACATCATGTC GATATGCATC TATAATGCCA 4380 

TTAAGAGACT GATGCGTTAA TGCAGCAATC CATTCAGATT GGTAACCTAA ATCTTCTGCT 4440

TCTATATAAC TAAT6ACACC TTGAGCTGTC ATAGCTTGCG TACCATTAAT CAATGCTAAA 4500 

CCTTCTTTAG CCTGAAGGTT CAAAGGTTGT CTATTTAATT CTCTTAATAC ATCGTCACTA 4560 

TCCTTTTCTT CCCCTCTGTA CAATACTTTC CCTTCACCAA TTAAT6CTAA TGCTAAATGT 4620 

GATAATGGCG CTAAATCTCC TGATGCACCG AGAGAGCCTT GCTGTGGGAT TATCGGTATA 4680 

ATACGTTCAT TTATAAAAAA TTGTAATT6T CTCACTAATT CTAAAGTGGC ACCTGAATGA 4740 

CCTTTTAATA ATGTATTCAA TCGTAAAATC ATCATGACTA ATGCTACTTd TTTTGAAAAT 4800

GGCTCACCTA GTCCACAGGC ATGTGAGCGT ATCAGATTCA CTTGTAATTC ATTATATTGC 4860 
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

TCCTCATTTT CAATAATACG TTCAACTACC GCTCTACTTT TTTTGACACG TTCTAACGCA 4980

TCATCAATAA TTTCAATCTT TGATTGTTGT TGTAAAAATG ATTTAATATC CTCAATTGTT 5040 

AGTGTTTCAC CATCTAAATA TA7UVGTCATA TATGTTACCC CCTTGTTTAT ATTAAGTAAC 5100 

CCATCCTTCT TGAAGTATAC GTTTTCATTT TTATTGAAAC AATGGTTTTA CGTACATTTA 5160 

TAACCTATTA TCAGAGCACT ATTGTAGTGC GTTAAAGGAT ATTAAGATTG TTGTAAGCAT 5220 

ATTTAATAAT TTATCTATTG ACGAATTGCA TATACAGGTA TAGTATTTTC TATTGTATTT 5280 

AACX3ACAAAT AATAATGAAT TCAGAAATTT ATAATACATT TTGTTAAAAG TTACTATATA 5340 

TTTTTAAAAT TGAATAAATT CGGAAAAGGC TTTTACATGG GAGGTTATAT CACTATGGAA 5400 

AC6TTAAATT CTATTAACAT TCCTAAGCGT AAAGAAGATT CACATAAAGG TCATTATGGC 5460 

AAAATTTTAT TAATTGGTGG ATCTGCTAAC TTAGGTGGTG CCATTATGTT AGCGGCTCGT 5520 

GCATGTGTAT TTAGCGGTA6 TGGTTTAATC ACTGTAGCTA CACATCCAAC AAATCATTCA 5580 

GCATTACATT CTCGTTGCCC AGAAGCGATG GTTATTGATA TTAATGATAC GAAAATGTTG 5640 

ACGAAAATGA TTGAAATGAC TGACAGTATA CTAATTGGTC CAGGTCTTGG CGTTGATTTC 5700 

AAAGGAAATA ATGCCATTAC ATTCCTACTA CAAAATATAC AACCGCATCA AAATTTAATC 5760 

GTAGACX3GCG ATGCGATTAC AATCTTTAGT AAACTGAAAC CGCAATTACC TACATGTCGT 5820 

GTGATCTTTA CACCACACCT CAAAGAATGG GAACGATTAA GTGGTATTCC TATTGAGGAA 5880 

CA6ACATATG AGCX5TAATCG TGAAGCAGTT GATCGTTTAG GTGCAACTGT TGTACTTAAA 5940 

AAACATGGTA CTGAAATTTT CTTTAAAGAT GAAGACTTTA AATTGACAAT CGGTAGCCCA 6000 

GCAATGGCGA CTGGTGGTAT GGGCGATACA CTTGCTGGTA TGATTACAAG CriTGTCX^GT 6060 

CAATTTGATA ACTTAAAAGA AGCGGTTATG AGTGCCACAT ATACACATAG TTTTATTGGC 6120 

GAAA^CCTTG CAAAAGATAT GTATGTGGTG CCACCATCAA GACTTATCAA TGAAATACCT 6180 

TACGCAATGA AACAATTAGA AAGTTAGTCA TTACTAATCA TTGAATATAG TAAAGCATTA 6240 

CTTTCTAGCA TAAAAATAAG ACTCCCCTAC ATATAGGGAA GTCTTATTTT TTATTATTCT 6300 

TCATCTGATG ATTGTTGTAT ATCTTCTTCA ACACGATCCA TGAAATCTTG TCTTACTTCA 6360 

ATACGTCCAT CTTCATCATT TTCTTCTGAA TCAATCACTT CAGTATGAAT TCCATTTCCT 6420 

GGTGTTTCAT CATTTaCAAC CGCTTCACGT TGTTGTTCAG TACCATCTTC AGATACAGTT 6480 

GAAGTAGATT GCTCATCTTC ATTCGTTTCA TCTTCT6CAT CTTCTTTTAC TTTAGCAACC 6540 

GTTGAAACAA ATTGATCATC ACCTAAGC6A ATTAAGCGAA CACCTTGTGC TQCACGACCA 6600 

TTTTGAGAAA TATCTGCAAC ATCTAGTCGA ATAATGACAC CTGCATTAGT AACAATCATT 6660 
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GTAGCTGTTT TAATACCTTT ACCACCACGA TTTGATAAGC GATAGTCATT AACTGGCGTA 6780 

CGTTTACCAT AACCATTTTC AGTAACTACT AATACTTCAT CAACACTGTT TGCATGAGCT 6840 

ACATCAAGCC CTACAACTTC GTCACCTTCA CGAAGTGTAA TACCTTTCAC ACCCGTTGCT 6900 

GTACGGCCTA AAGGACGTAA TGTTGATTCA GGGAATCX3AA TTAATGATGC ATGTGATGTA 6960 

CCAATCAAGA TATCTTCTTG ACCACTTGTT AAGCGAACTG CAATTAACTC ATCATCTTCT 7020 

CTGAACGAAA TCGCAATCTT ACCATTTCTA TTTATTCTTG AGAAGTTACT TAATGCTGAA 7080 

CGTTTAACGA CACCACGTTT AGTTGCAAAC ACTAAGAAGT TGTCTTCACT TTCAAGGTCT 7140 

TTAACAGCAA TCATT6TACT AATGACTTCA TCATTTTCAA GTTCAATAGC ATTCACTACA 7200 

G6AATACCTT TAGACTGTCT TGATAACTCA GGCACTTCGT AACCTTTAAG TTTGTATACA 7260 

CGACCTTTGT TAGTAAAGAA CAATACATGG TCATGTGTAC TTAAAGTTAC CAATTQACTC 7320 

ACAAAATCTT CTTCCAATGT ATTCATACCT TGAACACCAC GACCACCACG GTTTTGAGCA 7380

CGATATGTAG ATACCGGCAA ACGTTTAATG TAGTTATTAT QGCTTAGTCT AATTACTATT 7440 

TGTTCTTCTG GAATTAAGTC TTCGTCCTCT AAGTCTTCAA ATCCACCTAA TTGAATTTCT 7500 

GTACC3ACGAT CATCACCGAA ACGATCTCTA ATTTCAGTCA ATTCATCTCT AACTAACTCT 7560

AATAACACTT CTTCATCAGC TAAGATTGCT TCTAATTCAC TAATATAATT TAATAACTCA 7620 

TTATATTCAG CTTCAATTTT GTCTCTCTCT AAACCTGTTA GACGTCTTAA ACGCATGTCT 7680 

AAAATAGCTT GAGCTTGTTT TTCAGAAAGT TTGAAGCGTT GTTGCAAGCT TTCCATTGCA 7740

ACTTTATCTG TATCTGACTC ACGAATCGTT GAAATAATTT CATCGATATG GTCAAGTGCG 7800 

ATACGTAATC CTTCTAAAAT GTGGGCACGA TCTTTAGCTT TACGTAAgTT GTATTGCGTA 7860 

CGTCTTCTAA CAACTGTCTT TTGATGCTCT AAATAATGTA CCAACGCTTC TTTTAAATTA 7920

ATAAGCTTCG GTCTACCATT TACAAGTGCA ATCATATTCA CACCAAATGA TGTTTGAAGA 7980 

GGTGTTTGTT TGTATAAGTT ATTTAAAATG ACACTAGCAT TTGCATCCTT ACGCACATCA 8040 

ATAACGACAC GCACACCAGT ACGTAAACTT GXTTOVTCAC GTAAATCAGT GATACCGTCA 8100 

ATTTTCTTGT CACGAACGAG CTCTGCAATT TTTTCAATCA TACGAGCCTT ATTCACTTGG 8160 

AAAGGAATTT CAGTGACAAC AATACGTTGA CGTCCGCCTC CACGTTCTTC AATAACTGCA 8220 

CGAGAACGCA TTTGAATTGA ACCACGACCT GTTTCATATG CACX5TCTAAT ACCACTCTTA 8280 

CCTAAAATAA GTCCAGCAGT TGGGAAATCA GGACCTTCAA TATCCTCCAT TAACTCAGCA 8340 

ATTGAAATAT CAGGGTTCTT ACTTAAGCTA AGTACACCAT TGATTAATTC TGTTAAGTTA 8400 

TGTGGTGGAA TATTOGTTGC CATACCTACC GOQATACCTG ATGCACCATT GGCTAATAAG 8460 
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AAATCTATTG TATCTTTATT AATATCAOGT AACAGTTCAA GTGTGATTTT AGTCATACGC 8580

GCTTCAGTAT AACGCATTGC TGCTGCGCCA TCTCCATCCA TTGAACCAAA GTTACCTTGG 8640 

CCATCAACAA GCGGATAACG ATAACTGAAA TCTTGAGCCA TACGTACCAT TGCTTCATAA 8700 

ATAGATGAGT CACCATGAGG GTGATATTTA CCCATTACGT CACCAACGAT ACGTGCTGAT 8760 



rrrnATATG atttatccgg TGTCATACCT TGTTCATTTA ATCCATATAG TATACGACGA 8820

TGTACTGGTT TTAAACCGTC ACGAACATCT GGCAATGCAC GAGCAACGAT AACACTCATC 8880

GCATAATCTA AAAATGATTC ACGCATTTCA CTGGTAATAT TTCGTTCATT TATTCTTGAT 8940

7GAGGTAATT CAGCCATCAA GAGTTCCTCC TTCAAAAGTT CAGTTCACAG CGCTTAGAAG 9000

TCTAAGTTT6 CATAAACTGC ATTATCTTCT ATAAATTGTC TACGGTTTTC TACAACGTCA 9060

CCCATTAACA TTTCAAATGT TTGGTCCGCT TCAATCGCAT CTTCAAGTTT TACTTGTAAA 9120

AGAGCXSCGGT GCTCAGGGTT CATTGTTGTT TCCCAtAATT GATCTGCATT CATTTCTCCA 9180

AGACCTTTGT ATCGTGCAAT AGACCATTTT GGTGTTGGAT TCAATTCA6A TTTAAGTTTA 9240

TCAAGTTCCC TATCATTGTA TACATAATAC TTTTGTTTAC CTTGTGTCAG TTTATACAAC 9300

GGTGGCTGTG CAATATACAC ATAGCCTGCT TCAATTAAC6 GTCTCATAAA TCGATAGAAG 9360

AATGTTAATA ACAATGTTCT AATATGCGCT CCATCCACAT CGGCATCAGT CATAATGACG 9420

ATTTTGTGAT ATCTTGCTTT CGCTAGATCA AAGTCGCCAC CGATTCCTGT ACCAAATGCT 9480

GTGATCATTT GACXy^TTTC ATTGTTATTC AAAATTCTAT CTAATCGTGC TTTTTCAACA 9540

TTTAATATCT TACCTCGTAA TGGTAAAATC GCCTGCGTTC TAGAGTCACG ACCAGATTTT 9600

GTAGACCCCC CGGCAGAGTC CCCTTCGACT CACATTCTTC AGGACTTTTA 9660

CTAGAGCAAT CGGCTAATTT ACCTGGAAGG CTTGCTACAT CTAACGCTGA TTTACGACGT 9720

GTTACTTCAC GCX5CTTTTTT CGCAGCAACA CGTGCACGTG CCGCCATAAT ACCTTTTTCA 9780

ACCACTGTAC GTGCGACTTG TGGATTTTCA TATAAAAATC GTTCAAAGTG CTCTGAGAAT 9840

AATTTATCTA CAACTTGACG CACTTCAGAA TTACCTAATT TTGTCTTCGT TTGACCTTCG 9900

AATTGAGGAT CACCATGTTT GATAGATATA ATTGCTGTCA TACCTTCACG TGTATCTTCA 9960

CCAGAAAGTC TATCTTTTTC TTCTTTCATA ATCTTGCTAC TTAAACCATA ACTATTTAAG 10020

ACACX3CGTTA ATGCACGTTT GAATCCGTCT TCATGCX3TAC CACCTTCATA CGTATGAATG 10080

TTATTTGCX3T AAGTTAAAAG ATTTGTGGCA TATCCT6AGT TATATTGAAT CGCAATTTCT 10140



ACTTCAATAT CATCTTTAGA TTGATGAATA TAAATTGGCT CATCATGAAT AGGTTCTTTA 10200 

mTCGTTCA ATAACTCAAC GTACGATTTA ATACCGCCCT CATAGTGATA GGAGTCTTCT 10260 
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GCAAGCTCTC TAATACGCTG CTGTAATGTT TCATAGTTGT ATACAGTTGT CTCTGTGAAG 10380

ATTTCTCCAT CTGCTTTAAA ACXSAAtGaCA GTACCTGTCT TAtCAGTnGT GCCAACTTCT 10440 

TTTAAGTCAA ATTGAGGTAC ACCTTTTTTA TATGCTTGAT GATATATAGT CTCATTTCTG 10500 

TGTACATATA CTTCTAAGTC TTGTGACAAT GCGTTTACAA CTGATGAACC AACACCATGT 10560 

AAACCACCAG ATACTTTGTA TCCGCCACCG CCAAATTTAC CACCAGCATG TAAAACAGTT 10620 

AAAATAACTT CGACAGCTGG ACGTCCCATT TTTTCTTGAA TATCAACTGG GATACCACGT 10680 

CCGTTATCCG TTACTTTAAT CCAGTTATCT TTTTCAATAA CAACTTCAAT TTGATTTGCA 10740

TAACCaGCTA ATGCTTCATC GATACTATTA TCGACAATTT CCCACACTAA ATQGTGCAAA 10800 

CCTCTCTCTG AAGTCGATCC TATATACATA CCTGOTCTTT TACGTACTGC TTCTAAACCT 10860 

TCTAATACTT GTATTTGCCC AGCACCATAA TTATCCGTGT TGTTTACATC TGACAATGCA 10920 

GTCACCATCG CTTTCTGrTA CTTTATAATT TCACCTTGAT TAATACGATA CAATTTAGCG 10980

TTATTCATGA TTTCATGATC AATACCATCT ACAGATGTCG TAGTGACAAA TGTTTGTACT 11040 

TTATGCTGAA TCGTACTTAA TAAATGCGTT TGACGCGAAT CATCTAATTC ACTGAGTACA 11100 

TCGTPTAATA ATAAGATGGG ATATTCCCCA ACTTC6ATAT TCATTAACTC AATTTCAGCT 11160

AATTTAATGG ACAAAGCCGT TGTACGTTGC TGTCCTTGAG AACCATATGT TTGAGCATCC 11220 

ATGCCATTCA CATCAAAACT TATATCATCT CGATGTGGTC CGAATAAGCT AATGCCTCGT 11280 

TCTTTTTCTC TTTGCATATT ATCGCTAAGA ATAGACATAA TTTCTTCAAG TCGTGCCGCT 11340

TCATTTTGAG CATAATCAAA TTTAAGACTA GGTAAATAAT TCAGCGACAA CGCTTCTTTA 11400

TCATTTGTGA TACCAGCATG AATCGGTTTA GCTAACGACT CTAGCTCTTG AATAAAATGT 11460 

GCACGTTTAT CAGTTACTTT CATTGCATAT TCAGCAAACT GCTGATTTAA TACTTCCAAC 11520

ATTGTTAAGT CCTTTTTTTG GCCTAATTGT AACTGCTTTA AGTAATTATT CTTTTGCTTT 11580

AAAATACX3TT GGTATTGAGC TAAATCATTT AAGTAAACAG CAGAAATTTG GCCCAACTCC 11640

ATATCTATAA AGCGTCGTCT TATTtGrGGr GAGCCTTTTA CAATATTCAA ATCTTCTGGC 11700 

GCAAATAGAA CCACATTGAG GTGTCCAATA TATTGAGTTA GACGACTTTG CTCTAAGTGn 11760 

ATTCACTTTG GACTTGTTTA CCTTTnTTAG TTATAAACAT TGTTAATGGG CATCGTGCCG 11820 

TGT 11823 
(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 692 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 

ATAATTATTA ACATGGTGTG TTTAGAAGTT ATCCACGGCT GTTATTTTTG TGTATAACTT 60 

AAAAATTTAA GAAAGATGGA GTAAATTTAT GTCGGAAAAA GAAATTTGGG AAAAAGTGCT 120 

TGAAATTGCr CAAGAAAAAT TATCAGCTGT AAGTTACTCA ACTTTCCTAA AAGATACTGA 180 

GCTTTACACG ATTAAAGATG GTGAAGCTAT CX3TATTATCG AGTATTCCTT TTAATGCAAA 240 

TTGGTTAAAT CAACAATATG CTGAAATTAT CCAAGCAATC TTATTTGATG TTGTAGGCTA 300 

TGAAGTTAAA CCTCACTTTA TTACTACTGA AGAATTAGCA AATTATAGTA ATAATGAAAC 360 

TGCTACTCCA AAAGAAACAA CAAAACCTTC TACTGAAACA ACTGAGGATA ATCATGTGCT 420

TGGTAGAGAG CAATTCAATG CCCATAACAC ATTTGACACT TTTGTAATCG GACCCGGTAA 480 

CCGCTTTCCA CATGCAGCGA 6TTTAGCTGT GGCCQAAGCA CCAGCCAAAG CGTACAATCC 540 

irtTTATTTATC TATGGAGGTG TTGGtTTAGG aAAAACCCAT TTAAT6CATG CCATTGGTCA 600

TCATGTTTTA GATAATAATC CAGATGCCAA AGTGATTTAC ACATCAAGTG AAAAATTCAC 660 

AAATGAATTT ATTAAATCAA TTOGTGATAA nA 692 
(2) INFORMATION FOR SEQ ID NO: 138:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7900 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 

ATACTGTAGC GCAAATTTCA CAATGGCATG TTATAGAAGA TTTAGTTACG AATGAATTAG 60 

GTATTAGTAT TTTACCAACA TCAATTTCAG AgCAACTAAA TGGAGATGTG AAGCTGtACG 120 

CATTGAAGAT GCTCATGTAC ATTGGGAATT AGGTGTTGTT TGGAAGAAGG ATAAACAATT 180 

AAGTCATGCC ACAACGAAAT GGATAGAATT TTTGAAAGAC CGTTTAGGCT AACATATTAA 240 

TAAAGCACTC ATTATTTAAG GCGCATCATT ACGTGGGTCA TTGAAATAAT GAGTGTTTTT 300 

TTGTGAAAAT GAAGTGAAAT TTAGAGAGCG TTTCCATAGA AAATAGTAAT ACAAACTATA 360 

AAAAAAGAGT ATTTTTATAT TGTGTACGCC ATCTTTATAA TAGTTATTGT AACAATTTAG 420 

ACATATTTAG AAAGGGATGG CGCCATGCAC AAAGTCCAAT TAATAATCAA ACTACTACTA 480 

CAACTAGGAA TCATCATTGT GATTACTTAT ATTGGCACAG AAATTCAAAA GATTTTTCAT 540
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ATTGTACOGC TAACTTGGGT AGAAGACGGT GCAAACTTTT TATTAAAGAC GATGGTCTTT 660

TTCTTCATAC C6TCAGTTGT AGGtATTATG GaTGtgCTTC CGAAATTACG CTAAATTATA 720

TACTCTTTTT CGCAGTCATT ATCATAGGAA CATGTATCGT TGCATTATCT TCAGGTTATA 780

TTGCTGAAAA AATGTCyGtT AAACwTAAAC ATCGTAAAGG TGTAGACGCt TATGAATGAT 840

TACGTGCAAG CCTTATTAAT GATTTTGTTG ACTGTCGTTT TATATTATTT CGCTAAAAGG 900

TTACAACAAA AATATCCGAA CCCATTTTTG AATCCAGCAT TAATTGCATC TTTAGGAATT 960

ATTTTTGTCT TACTTATCTT TGGAATTAGT TATAACX3GGT ATATGAAAGG TGGCAGTTGG 1020

ATCAACCATA TTTTAAACGC AACGGTCGTA TGTTTAGCGT ACCCACTTTA TAAAAATAGA 1080

GAGAAAATTA AAGACAATGT CTCTATCATT TTTGCAAGTG TATTAAcTGG CGTCATGCTG 1140

AATTTCATGT TAGTGTTCTT AACACTTAAA GCATTTGGCT ATTCTAAAGA CGTCATTGTA 1200

ACGTTATTGC CCCGATCTAT AACAGCCGCA GTAGGTATCG AAGTGTCACA TGAACTAGGT 1260

GGTACAGATA CGATQACCGT ACTTTTTATT ATCACAACGG GTTTAATCGG TAGTATTTTA 1320

GGrrCGATGT TATTAAGATT TGGAAGATTT GAATCTTCTA TCGCCAAAGG ATTAACX3TAT 1380

GGGAATGCGT CACATGCATT TGGCACAGCT AAAGCACTAG AAATGGATAT TGAATCCGGT 1440

GCATTTAGTT CAATTGGGAT GATTTTAACT GCAGTTATTA GTTCAGTGTT AATACCTGTT 1500

CTAATTTTAT TATTCTATTA ATTTAGATAT TTAAAATGAT AGACAGAAAG GGAGGCTATT 1560

AGTAATAATG GCAAAAATAA AAGCAAATGA AGCATTAGTT AAAGCATTAC AAGCaTGGGA 1620

TATAGATCAC TTGTATGGTA TTCCAGGAGA CTCAATCGAC GCATAGTCGA TAgTTTACGT 1680

ACAGTGAGAG ATCAATTTAA ATTTTATCAT GTACGTCATG AAGAAGTAGC AAGCTTAGCG 1740

GCTGCTGGTT ACACAAAATT AACTGGTAAA ATCGGTGTGG CATTAAGTAT CGGTGGCCCT 1800

GGTTTAATTC ATTTATTAAA TGGTATGTAT GATGCCAAAA TGGATAATGT ACCGCAATTA 1860

ATATTATCTG GACAAACGAA TAGTACAGCA CTTGGAACGA AAGCATTCCA AGAAACAAAT 1920

TTACAAAAAT TATGTGAAGA TGTAGCCGTT TATAATCACC AAATTGAAAA AGGTGACAAT 1980

GTGTTTGAAA TCGTTAACGA AGCAATTCGT ACGGCATATG AACAAAAAGG TGTAGCTGTT 2040

GTTATTTGTC CTAACGACTT ATTAACTGAA AAAATTAAAG ATACAACGAA TAAACCAGTA 2100

GATACATCAA GACCAACAGT AGTATCACCA AAATATAAAG ACATCAAA/^ AGCGGTTAAA 2160

CTAATTAATA AAAGTAAAAA GCCTGTCATG TTAATTGGTG TAGGTGCGAA ACATGCGAAA 2220

6ATGAGCTAC GTGAATTTAT TGAAATGGCT AAAATTCCTG TCATTCATTC ATTACCAGCT 2280

AAAACAATCT TGCCGGATGA TCATCCATAT AGTATCGGtA ACTTAGGTAA AATCGGTACC 2340
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CCATATGTGG ATTACTTACC TAAGAAAAAT ATTAAAGCCA TTCAAATTGA CACAAATCCT 2460

AAAAATATCG GACATCGTTT CAATATTAAT GTAGGAATTG TTGGAGATAG TAAAATTGCG 2520

TTGCATCAGT TAACTGAAAA TATTAAACAT GTTGCTGAAA GACCATTCTT AAACAAAACXI 2580

TTAGAACGTA AAGCGGTTTG GGATAAATGO ATGGAACAAG ATAAAAATAA TAATAGTAAA 2640

CCATTACGTC CAGAACGATT AATGGCATCA ATCAATAAAT TTATTAAAGA TGATGCAGTG 2700

ATTTCA6CAG ATGTAGGTAC AGCAACAGTT TGGTCAACTC GATACTTAAA CCTTGGTGTA 2760

AATAACAAGT TCATCATTTC AAGTTGGTTA GGTACAATGG GTTGCGGTCT TCCAGGTGCA 2820

ATTGCATCAA AAATTGCATA TCCAAATAGA CAAGCCATCG CAATTGCTGG TGACGGTGCA 2880

TTCCAAATGG TAATGCAAGA CTTCGCTACA GCAGTACAAT ATGATTTACC TTTAACTGTA 2940

TTTGTACTTA ATAACAAACA GTTAGCATTT ATTAAATATG AACAACAAGC AGCTGGTGAA 3000

TTAGAATATG GAGTTGATTT TTCTGATATG GATCATGCAA AATTTGCTGA GGCAGCAGGT 3060

GGTAAAGGTT ATACAATTAA GAGTGCTAGC GAAGTAGATG CTATAGTCGA AGAGGCATTA 3120

GCACAAGATG TACCAACGAT TGTAGATGTA TATGTTGATC CTAATGCTGC GCCATTACCA 3180

GGTAAAATTG TAAATGAAGA AGCGCTTGGT TATGGTAAGT GGGCATTTAG ATCAATTACT 3240

GAAGATAAAC ATTTAGATTT AQATCAAATT CCACCAATTT CAGTGGCAGC AAAACGTTTC 3300

TTATAACTGA TTTAAAGGTT ATCACAATTG AATTGAACTA TAAAAACGGT AATTTCTATT 3360

TCAACAAAAT GGGAATTGCC GTTTTGTTTA TTTATCACAA ATGATCX5TAC TGAATTGATG 3420

ATAAAATTGT GAAAAAGTTG TTGAAAACGC TTTTACAAAT ATGTATAATA GCTATGAATT 3480

AGATATCACT TGCGTGTTAC TGGTAATGCA GGCATGAGCA AACAACCGCA CTATGAGAAT 3540

AGTCTTGTTT GTTCATGCCT GCTTTTTTTG TACATGGAAG CGGAAATTGA GATAGGGGAT 3600

GTTTfiTATGT TTAAGAAATT GTTTGGACAA TTGCAACGTA TCGGTAAAGC ATTAATGTTA 3660

CCTGtTGCGA TTTTACCAGC AGCTGGTATT TTATTAGCGT TTGGTAACGC AATGCACAAC 3720

GAACAATTAG TAGAAATTGC ACCATGGTTA AAAAACGATA TCATTGTAAT GATTTCGTCG 3780

GTCATGGAA6 CAGCAGGACA AGTTGTATTT GATAACTTGC CATTATTATT TGCAGTTGGT 3840

ACAGCACTTG GATTAGCAGG AGGAGACGGT GTTGCAGCAT TAGCAGCGCT AGTAGGTTAC 3900

TTAATTATGA ATGCAACAAT GGGGAAAGTG TTGCACATTA CAATTGATGA CATTTTCTCA 3960

TATGCCAAAG GGGCAAAAGA ATTAAGTCAA GCAGCGAAAG AACCAGCACA TGCTTTAGTA 4020

TTAGGTATTC CAACGTTACA AACGGGTGTG TTTGGTGGTA TTATCATGGG TGCTTTAGCC 4080

GCATGGTGTT ACAACAAATT TTATAATATT ACACTACCAC CATTTTTAGG ATTCTTTGCA 4140
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AGCTTTGCGT OGCCACCAAT TCAAGATGGA TTAAATAGTT TATCGAATTT CTTATTAAAT 4260 

AAAAATTTAA CATTAACAAC GTTTATATTC GGTATTATTG AACGCTCATT AATTCCATTT 4320 

GGTTTACATC ATATTTTCTA TTCACCGTTC TGGTTTGAAT TCGGAAGTTA TACAAATCAC 4380

GCAGGTGAAT TGGTTCGTGG TGACCAACGT ATTTGGATGG CACAATTGAA AGATGGCGTA 4440 

CCATTTACTG CTGGTGCATT TACTACTGGT AAATATCCAT TTATGATGTT TGGTTTACCA 4500 

GCGGCGGCAT TTGCTATTTA TAAAAATGCA CGACCAGAAC GTAAAAAAGT CGTGGGTGGT 4560 

TTAATGTTAT CAGplGGATT AACTGCATTT TTAACTGGTA TCACTGAGCC ATTAGAATTT 4620 

TCATTCTTAT TTGTAGCACC AGTACTTTAT GGAATTCACG TATTATTAGC TGGTACATCA 4680 

TTCTTAGTAA TGCATTTATT AGGCGTTAAA ATTGGTATGA CATTCTCAGG TGGTTTCATA 4740

GATTATATTT TATATGGTTT ATTAAACTGG GATCGTTCAC ACGCATTATT AGTTATTCCA 4800 

GTCGGTATTG TATATGCTAT CGTGTATTAC TTCTTATTCG ACTTTGCAAT TCGTAAGTTT 4860 

AAATTGAAAA CACCAGGTCG TGAAGATGAA GAAACTGAAA TTCGTAACTC TAGTGTCGCA 4920 

AAATTACCAT TTGATGTCTT AGATGCAATG GGTGGAAAAG AAAACATTAA ACATTTAGAT 4980 

GCATQTATTA CACGTCTACG CGTAGAAGTG GTTGATAAAT CAAAAGTAGA TGTAGCAGGT 5040 

ATTAAAGCTT TAGGCGCATC AGGTGTATTA GAAGTTGGAA ACAATATGCA AGCTATCTTT 5100 

GGTCCAAAAT CAGATCAAAT TAAACATGAT ATGGCCAAGA TTATGAGTGG T6AAATTACG 5160 

AAACCAAGTG AAACGACAGT GACTGAAGAA ATGTCAGATG AACCAGTTCA CGTAGAAGCA 5220 

CTTGGAACAA CAGACATCTA TGCACCAGGT ATCGGTCAAA TCATTCCATT ATCAGAAGTA 5280 

CCTGATCAAG TATTCGCTGG TAAAATGATG GGTGATGGTG TTGGCTTTAT CCCTGAAAAA 5340 

GGTGAAATTG TAGCACCGTT TGATGGTACA GTGAAAACAA TCTTCCCTAC GAAACATGCG 5400 

ATA(JE3ATTAG AATCTGAAAG TGGCGTCGAA GTACTTATTC ATATTGGTAT CGATACAGTG 5460 

AAACTGAATG GTGAAGGATT CGAAAGTCTG ATTAACGTTG ATGAAAAAGT AACACAAGGT 5520 

CAACCATTAA TGAAAGTGAA TTTAGCATAC TTGAAAGCAC ACGCACCAAG CATCGTTACA 5580

CCAATGATTA TTACAAATCT TGAAAATAAA GAACTTGTCA TTGAAGATGT ACAAGATGCT 5640 

GATCCAGGTA AGCTAATTAT GACAGTCAAA TAATGATTAA AAATGAAACA GCATATCAAA 5700 

TGAATGAACT TTTAGTCATT CGTAGTGCGT ATGCGAAGTA GCGAGTTGAA AGAGAATACG 5760

TTACAAAAGG CAGTAGCTTA AAATGAAGCT ACTGCCTTTT TA6TGCGCAA TGATGTATAG 5820 

CAGGTGTGTT GATGrTAATA AGTTAAATAT TAGTGTTAGA TATAGAAAAC ATTGCTTATG 5880 

TTTTTGTCAC ATTTTAGAAA AATGCATCTT CGCGACTAGC CAAATTAATA GTCTCATTGA 5940

AATAAATTAA CATGATTTTA AATCTATTTG TAAGATAAGG AGATTTGTCA TTATGACAAC 6060

AGAAGGTCTA TTAGTTGCAG AGAAAGAAAT CGAAGTGAAT GGTTACX3ACA TTGATGCGAT 6120 

GGGTGTC6TT AGTAATATCG TTTATATTAG ATGGTTCGAA GATTTGAGAA CAGCGTTTAT 6180

TAATCAGCAC ATGAATTACT CAACAATGAT CAATCAAGGC ATTTCACCTA TACTTATGAA 6240 

AACGGAAGCA GAGTATAAAG TACCTGTCAC AATACATGAC AAACCAGTAG GTCGTATTTA 6300 

CTTAGTTAAA GCAAGCAAGA TGAAATGGGT GTTTCAGTTT GAAATTGTGT CCGCACATGG 6360

CGTGCATTGT ATTGGTACAC AGACAGGCGG TTTTTACAGA TTGAGTGATA AGAAGATAAC 6420 

CTCTGTGCCA CAAGTGTTTC AAGACATTTT AGCAACAAAA TAATGACTTC ATTTTAAAAT 6480 

ATAAAAAGTA AGAAGGTGTT CGAAATGGTT AAGCAATTAA ATAGTGTCGA AGCATTCCGT 6540

GAATTTATTC ATCAATATCC GTTAGCAGTT GTACATGTCA TGCGCGATCA GTGTAGCGTG 6600 

TGTCATGCCG TTTTACCACA AATTGAAGAC TTGATGCAAT CATATCCCAA TGTGCCATTA 6660 

GCTGTGATTA ATCAAAGTCA GGTGGAAGCT ATTGCTGGAG AATTAAATAT TTTCaCTGTA 6720 

CCTGTGGATT TAATTTTTAT GAATGGAAAA GAAATGCATC GTCAAGGGCG TTTTATCGAT 6780 

ATGCAACGTT TTGAACATCA TCTTAAGCAA ATGAATGATA GTGTAAATAA CGAT6TCGAT 6840 

GAGCATTAAT ATCGCAAATG ATTAGCATTG CTAAGATTAT GTAGACATCA TAACTTATTT 6900 

CCCAGTAAAT ATTGGTAGTA ATTAGAATCA GCATGGTACA GTAGAACTAT AGTAGAAATC 6960 

ATCAAAGAGG AGTGACGACA AATGCGTAAA AAATGGTCTA CACTTGCGTT TGGATTTTTA 7020 

GTTGCAGCAT ACX5CACATAT TAGAATTAAA GAAAAACGCA GTGTGAAAAG TTATATGTTA 7080 

GAACAAGGTA TACGATTATC TAGAGCTAAG CGTCGTTTTA TGTATAAAGA AGAAGCGATG 7140 

AAAGCATTAG AAAAAATGGC GCCACAGACA GCAGGCGAAT ATGAGGGAAC CAATTATCAG 7200

TTTASGATGC CAGTAAAAGT GGATAAGCAC TTCGGTTCAA CCGTTTATAC CGTTAACGAT 7260 

AAACAAGATA AGCATCAACG CGTTGTATTA TATGCACATG GAG6CGCATG GTTCCAAGAC 7320 

CCACTCAAAA TTCATTTCGA ATTTATTGAT GAACTTGCAG AAACACTCAA TGCTAAAGTC 7380

ATCATGCCAG TATATCCGAA GATTCCGCAT CAAGATTATC AAGOGAC3GTA TGTGCTTTTT 7440 

GAAAAGTTGT ACCATGATTT ATTGAATCAA GTAGCAGATT CTAAACAAAT CGTTGTAATG 7500 

GGTGACTCTG CGGGCGGTCA AATTGCTTTA TCATTTGCTC AATTGTTAAA AGAAAAACAT 7560

ATTGTGCAAC CAGGACATAT TGTATTAATT TCACCAGTTT TAGATGCAAC GATGCAGCAT 7620 

CCTGAAATTC CTGACTACTT AAAGAAAGAC CCAATGGTAG GTGTGGATGG CaGTGTGTTC 7680 

TTAGCTGAAC AATGGGCAGG GGACACACCT TTAGATAACT ACAAAGTATC ACCAATTAAT 7740

CCAGATGCTT TGAACTTATC GCAATTGTTG AGTGCGAAAG GTATOGAACA TGACTTTATA 7860 
CCTGGATATT ACCAATTCCA TATTTATCCA GTATTTCCGA 7900 
(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1984 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 

GTCTAAATAA ACAAAATTAT CATTGATTaC TGAACTGGCA TTTCGAAGTA ATGCTTCAAT 60 

ATCATTCGAA TATTTCTTCA ATTTATGATT GTGAAATAAT TCTTGCATCA AAAATGGTCT 120 

TTGGTCACAT GAATGTGCAT CTGAAGCTAC AAAATGAGCC AAATTACATT CTATAAATTG 180 

TAATGATAAC TTTTGAATGT TTTTACCAAA TCCACCAACT AAAGAACTCG ATGTTAATTG 240 

ACTCAGTGCC CCATTTGCAA CCAATTCATA TAATATTTCC GGATTTTTGG CGATACTTCT 300 

ATTTCTTTCA GGATGTGCAA TGATTGGTAT GTAACCTCTC GATTGTATTT CAAAAAACAA 360 

TTGTTTTGTA TAATGTGGTA CTTCGCCCGT TGGAAATTCA ATTAATAAAT ATTTCGAACG 420 

ATTAATACCT TGAATACTAC CATTATCTAA GCCTTTCAGA ATCGAATCTG TAATTCTAAT 480

TTCTTGCCCG GGAAATAATT TAATATCCAA TGCTTGAACT TCTGGATGCG TTCTTAACTC 540 

CGCCAATTTC ACAAGCACTT GTTGAAATGT ATTATCATAT CTCGGATGCA AATGATGAGG 600 

TGTCGCTACA ATACTTGTTA CACCTTCATC CTTAGCTTGC TTTAATAGTG CAATACTCTT 660 

TTCAATTGTT TTAGGACCAT CATCTATATC AACTAATATA TGGTTATGAA TATCAATCAT 720

GATTCATCAG TCCCATAATA TGCATAGTAA CTAGCACTTT TATCTTTAGG CATTCTATTT 780 

AAGACTACAC CTAATAATTT AGCACCTGTT GCTTCAATAA GTTCTTTTCC TTTTTTAACT 840 

TCATCTCTAT TATTATTTTC CGAATTAACT ACGTAGACAA CATTGCCGGT AAACTTTQAA 900

AATAATTGCG CATCTGTAAC TGTGTTCACT GGTGGCGTAT CGATAATTAC AAAGTTATAA 960 

TTCATCAATA ATGTGTCATA CAAATTTGCA AATGCCCTTG ATGTAATTAA CTCTGACGGA 1020 

TTCGGTGGGA TTGGCCCAGA CGTCAAGACG TCTAAATCTT GAATTTCAGT TGAGATAATA 1080 

CTGTCTTGAT AAGTTGACCA ATTTAGCAAT AAACTTGATA GGCCTTCATT GTTTGGCAAA 1140 

TTAAAAATAT AATGCTGCGT AGGTTTACGC ATATCCCCGT CTACGATTAG TGTTTTATAA 1200 

CCTGCTTGCG CATATGCAAC TGCTAAATTT GCTGCAATTG TAGACTTACC TGCGCCTGGT 1260

GATCTTATGC CTCGAAATTT CTCX3CTAATA GGTGACTTTG 6TTGTTCATG GACAATTAAA 1380 

CTTGATGTAC TTCyTCGTGT ATTCGTCATG GTAATTCCTC GTAAATTAAA ATTTTTGTAT 1440 

TGAACCTAAA ATAGGTAATC CTAGTTGCGA TTCAACATCT TCTTCTGTCT TAATACGCTT 1500 

ATCTAATAAT TCTTTTAAGA AAATAATCAA TATTGCTAAA ACAATACCAA CAATAATGCT 1560 

GATAACTAAG TTGACAGATA CTATTGGAGA TACTTTTACA GCATTATCAT GTGCTGAGGA 1620 

AAGTATCGTA ACATTATCAA CACTCATAAT TTTAGGCATG TCATGAGCAA AAACTTTAGA 1680 

TATnTATTA ACAATTTTGT CAGATTCAGA TTTATTCCCA GTGGTAACTG ATACAGTAAT 1740 

AATTTGAGAG TTTGTTTGAT TGGTTACTTT TAAAAATGAA TTCAACTCAG CTGTTGAATA 1800 

CTGACCATCA AnTTCTCTAG ATACTTTATC TASAATTCTA GGACTTTTGA TAATTTCCGT 1860 

ATATGTATTA ACAGACTGCA AACTACTTTG AACATTTTGG AAAGCTAAAT CACTTGAGGA 1920 

CTTTTTCATG TTCACTAATA TTTGAGTAGA AGCAGTATAT TTGTCAGGCA TAACAAAAAA 1980 

GGTT 1984 
(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6272 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: 

CAAATCCCTT GGTGATGAtA AAtGtATTGC TGTGTAGCCA AATAATCTTC GTATATATGA 60 

CTGACGTTCA ACAACAGCTT GCAATCGTTT CGTTOGTACA GTTACTTTCT TCTTGTTAAA 120

GAGACCATAT TCAATTTTAA GTTGCTCATT TTCAAGCATC ACCGAAAAGC CATAAAATCT 180 

TATCATTGTT ATAATCGTTC CAATAATATA TGCCACTATT AATACTAGTA AAATQATGAT 240

TAATACTGAA ATACTTACAA TTTGAACCCA TTGACTAATT TCATGATTTA GCTTCGACCA 300

TGGGATCAAC TCTCTTACAG CCCCGTAAAT CGGTACTAAA GCTGCTAACG TTACACCAAT 360 

GGCGCCACTG GTCATTGCCA TAAATAGTGA TTCTTTAAAA TTCATCTGAT ATATAGGAAT 420 

GCGTTTATTT TTCTGATTAA GCATACTATC AGTGTTCTGC ACTTCATCTA AGCGACCTTC 480 

TGCGATGTCT TCCACATTAC CTTCAATGTC ATGATTACAG TTGTCATTCT TCTCAGCACT 540 

AGACTTTTGC GCCACTTCTG TCTTCAACTC TGTTTGCAAT TGATCAATAT ATCGTTCAAG 600 

ATATTCACCT TGTTTTTTCG AAATAACACT TAAGACAATA CCATCACTTG GTGTTTTGAT 660 



AATACGTTTT ATATTTAATT CTTTACGCTT TTTATTAAAA ATACCTGTTG TTAAAATGAA 780

ATAATTATCC tCAATCCAAT ATCGCGTGTT CATAATTCCG ACAATTTGAG AAATGTATGA 840

TATTAAAAAG AATACAAATA CAATACCTAT CCATAAATAT GATTCGGGAT TCGTATAATC 900 

AAAATCTTTC AATTGAAAGA TAATGAAAAT AAAAAAGACG ACTATGTTTT GTTTGATAGC 960 

ATTGATTATG CCATTAAAAT ATGAAATCGG ATGTAATTTT TGAGGTTCAG ACATCACTTT 1020 

CAACCCCTCT CAAATTCGAC ATAGTTCTCT CTTCGATTAT TTTAACATCX3 TCATGAGACA 1080 

TCATCGGTAA ATAAATAGTA TGACCTGCAG TCATAAATCC AACTTTATAC AAATTAAGCA 1140 

CTTTACTAAT TGGATTAGAT TTAATCQACA AGTATTGTAA ACX3TTCAATT CGACTCGTTT 1200

CTTCTTTATA TATAAAAAAT GATGTACGAT ATTGTACACT TAGTTGATCA ACTTTATAAA 1260 

AGOGACAATG ATATTGCCAT AAAGGCTTAA TAAATAATTT TAATGTACTC AGAGCACCTA 1320 

AAACCAACAA AATATAAAGT AAGTAATGTG GCCATTCAAA TCTTAACCAT ATAAAATAAA 1380

AAATGACATA CACAGCTACA CTCAATATAA ATTCTAAGCC ATTCGTAATG TAGTAATACA 1440 

ACAATGCTGA CTTAGGACTC TTAGTCAACT TAGTATAATC TGACATATAC CCCTCTCCCC 1500 

AAATAAAAAA TTATACGGAT TTATAATCTA TTTCATTTTA TTTTTATATG ATGATAATTA 1560 

TAGCATATGG AATATTTCAT GCTAATTTAT TCTTCCTAAA GGTACATCTA AAAATTTAAT 1620 

TAAGCAGAAA GTGCTTGAAT TGCTAAAAAG ACACCATGTT ATAATTTTAT CAACATGATG 1680 

CCTTTCATCT ATAATCAATC TTTCATCTTA TCAAGAGCGA TATTTAGTTC AAGCACATTC 1740 

ACATAATCAT TTGTTAACAC ACCACGCTGC TTACGATGTT GAATCAAGTC GGCCACTCTT 1800 

GAAGTAGATA CATGACGAGC ATCAGCAATA CGAGGTGCTT GCTTCAATGC ATTTTCGACC 1860

GTAATATGCG GATCTAAGCC CGACCCAGAA CTTGTTGCAG CATCTATTGT TACATTTGAA 1920 

1TCCCAAATT TAACATGATG TTTCATGCGT GCTATTAATT CGGTGTTTCC ATTCGATTCA 1980 

TTACTTCCAC CTGAAGATAC GCCX5TTTTTA TATAATTTTT CAGGATTCAT ATTATAATCA 2040

ACTGCACTCG GTCTCCCGTG AAAATATCGT GTCTCTGTCC AGTGCTGTCC AATCAATTTT 2100 

GATCCAACTA TACGATTGTC ATACGTAATT AAACTGCCAT TTGCTTGTTG ATAAAAAAAT 2160 

ATTTGACCAA TTAACGTGAT AGCTAACGGG AATAAAAATC CACATAATAC CATAGTTATT 2220 

ATCGTTAAAC AAATACTATT TCTTATCGTA TTCATGGTAC AGGCTCCTTC CTCTTTACAC 2280 

AAAAAATTGT ACAATCATAT CTATTAATTT AATGCCTAAA AACGGGACGA TTAATCCACC 2340 

TAATCCATAA ATCAACATAT TATTTATAAA GATTCTATCA ATGCTGTAAC CCTTTACTTT 2400 

TACACCTTTC ATGGCAATTG GAATTAAGGC AACAATGATT AATGCATTGA ATATCAAAGC 2460

AATTGTTGAC ATCATTAGTG CAGGTAAAAT TGCAAAGTAT TTTGCTACGT CATTAGCCAA 2580

ACTAAATGTC GTTAATGCAC CTCTCGTCAT TAATAATTGT TTGCCTATTT TTACAACCTC 2640 

TATTAACTTT GTAGGATTCG AATCTAAATC AATTAGATTA GCTGCCTCTT TAGCACTAAT 2700 

TGTCCCTGAG TTCATAGCTA ATCCTATATT CGCTTtGTGc tAGCGCAGGT GCATCATTTG 2760 

TACCATCTCC TGTCATCGCA ACAATATGGC CTTTOGCTTG TTCATCTTTG ATGACTTTAA 2820 

TTTTATCTTC GGGTTTACAC TCTGCAACAA ATCTATCAAC CCCGGCTTCT TTTGCAATTG 2880

TAGCTGCTGT TAAAGCATTA TCACCTGTAC ACATAACTGT TTCAATCCCC ATTTTTCTCA 2940 

ATTCAGTAAA TCGTTCTACA AGACCATCTT TAATCACATC TTTTAAATAA ATCACGCCAA 3000

GCATGACATT GTTTTCAATG ACTATTAAtG GnGTGCCACC TTTACTOSAT ACATCCATAC 3060 

AGAGAGACTC AATATTAAGA GGAATATTGC CTTGTTGTTG TTTGACAAGA TTTATCATAC 3120 

TATTAGGTGC ACCTTTGAAT ACCGATATTT CATTTGTAAT GATTCCGCTC ATTCTAGTTT 3180 

CAGCTGTAAA AGGCTTATAT GTGCCATCAA TGTCTTTAGG CAGCTCATTT ATATACATcT 3240

GcttCGCTAA TCGTACAATA CTTTTTCCTT CTGGCGTATC ATCGTAGATT GATGACATAT 3300 

AAGCAGCGAC TATCAATTTT TCAAGCATTT GTTGATTCAC TGGTAAAAAT TCACTAGCGA 3360 

TTCGATTGCC ATAAGTGATT GTGCCTGTCT TGTCTAAAAT CATTACATCG ACATCTCCAC 3420 

ATACTTCTAC AGCACGCCCA CTTTTCGCTA ATACATTGAA TTGAGTAACA CGATCCATGC 3480 

CTGCAATACC AATCGCCGAT AACAAACCAC CGATTGTCX3T TGGTATTAAA CATACTGTTA 3540 

ACGCAATGAG CATCGCAATA GGTAAAATTA AATGCAGGTA AGATGCTATT GGATATAACG 3600 

TTACAATAAC GACTAAAAAT ATAATTGTTA ACOTTGTTAA TAATGTAAAA AGTGCAATTT 3660

CATTTGGTGT TTTATTTCTT TCCGCCCCTT CAACTAAGGC AATCATTTTA TCTAAAAAAG 3720 

ATGIACnCGC TTCACTCTCA ACACGTATTT CTAACCAATC AGATGTTACA AGTGTACCGC 3780 

CAATGACTCC ATCAAAATCG CCACCTGATT CTTTTATCAC AGGTGCAGAC TCACCAGTAA 3840 

TTGCAGATTC ATCAACGGTT GCTAATCCAT TTATTACAAC GCCATCAGCA GGGATTGTTT 3900

CTCCATTTTC TACCCGAATA TTTTGTCCGG CTTTTAACTC TGTGGCGTTC ACTATCCGAT 3960

ACGCACCATT TTCTTCTATC AATCGAGCAG TTAAATTTGA TTGTGCTTGT CTTAAACTAT 4020 

CAGCTTGCGC TTTTCCACGA CCTTCAGCAA AGGCTTCTGA AAAATTAGCA AACAATATAG 4080 

TTATTAATAA TATGATAAAA ATTGTAATCA AATAACCTCG CGATAGATAG CTAGTTCCAA 4140 

ATATGTCAGG AAAACATATT AATATCAACG TTAAAATCAT TCCAACCTCA ACGACAAACA 4200 

TTATCGGATT TTTTATTAAT TGTTTAAGAT TCAGCTTATA AAAACTCATT TTCAAAGCTT 4260

TTTATTTTAA AGTTA/U^T TCACCAATAG GACCAAGTAA TAGTACTGGA ATAAATGTCA 4380

AACCACTTAG TAAAACGATA AATACGATTA GTGATACGCC AAAATAAGGT TTATCAATCG 4440 

CTATTGTATA TTTATCTTGA TGGTATGATT TTTTATTCAC TAAACTTGAT GCAATCATTA 4500 

ATTGCAAAAT AATTGGTATA TAACGAGAAA GCAACATAAT GATTCCTGTA GAGATATTCC 4560 

AGAATGTTGT ATCATCTTTC AGTCCTTCAA ACCCTGATCC ATTGTTCGCA GCAGCTGATG 4620 

TCATTTCATA CATAACTTGT GAAATACCAT GAAAAGACX^G ATTCGTtATa CTTtCACTTG 4680 

CTCCAGGAAT CATAAAAGCA AGTGCTGAAA ATACTAAAAT TAAAATTGGO TGTATGy^GAA 4740 

AGACTAAGAC AATACATTTC ATTTCACGGG CGCCAATTGG CATATTTAAA TATTCIGGTG 4800

TTTTACCAAC CATCAAACTG CATATAAACA CCGTCAGTAA GACAAATATC AATAAATTCA 4860 

TGAGTCCTAC GCCTTCGCCA CCAAATACAA CATTTAGCAT CATTAATACC ATTGGTCCTA 4920 

ATCCACCTAT AGGCGTTAAG CTATCATGCA TGTTATTAAC AGAACCCGTT GTAAATGCCG 4980 

TCGTAATAAC TGTAAATAGT GCTGACAAAC CTGCTCCAAA CCGTACCTCT TTACCTTCCA 5040 

TATTCGGTCC ATAAATGCCT AAATTCGCTA GTATTGGATT ACCACGATAC TCACTCCACA 5100 

TAGTTAATGT AAGAATTGCT ATAAAAATGA AAAACATTGC GACAAATAAT ATCAACGCAT 5160 

GACGATGTAC TCGTTTACCA TGTCTACTTA ACATGCGACC AAATAAGAAC AACATTGACA 5220 

TAGGAAGTAA CATCATACTG CCCATTTCTA TAAAATTGCT CCAAATATTT GGATTTTCAA 5280 

AAGGTGTTGC AGAATTTCCT GCTAAAAATC CTCCACCATT CGTACCAAGA TGTTTTATTG 5340 

ATTCAAGTGA TGCAATAGGT CCAAATGCAA TATGTTGAAT ATGTCCGCTT AAAGTCCGAA 5400 

TCATTAAATT AGCATGCAAC GTTTGTGGTA CaCCTTGAGT CATCAATAAA ATACTAATTA 5460

AACATGATAA TGGTAAAAGT ACTCGGACAA TAAACCGAAC AATATCTTGA TAAAAATTAC 5520 

CAATCATATT AGTTAATCCA GTTAAACGTC TCAACATCGC TATACAAACG GCGTAACCTG 5580 

ATGCACTAGA TGTAAACATT AAATATGTCA TTACAATCAT TTGCGTTAAA TATGTCACAT 5640 

CTGaTTCACC GTTATAGTGT TGtAAATTAC TATTTGTTAA AAAAGATATT GCTGTATTAA 5700 

ACGCTAAATC TATCGATTGG TTTAAATTAT GATTTGGATT TAAAAAAAGC CATTGCTGAA 5760 

CTATTAGCAA TACAAATGTT ATAAACCCCA TAAATCCATT AAATGCCAGA AAATGTTTGA 5820 

CATATGTTTT AGCTGACATG TGTTCTAAAT CTGTGCCGAT AATTTTAAAA CACATATTTT 5880 

CAAATCTAGT AAATATTAAA TCTACTCTTG ACGATTGCAC CAATGCTACG CGATATAGAT 5940 

ATCCACTAAA AACATACGTA ATCATAACCA TCATTGTTAG AAACAAAATT ATTTCCATGA 6000 

TAACCCTCAC TTAATATATT TCTAAAATTT TTCACTACGA ATTAAGGCAT AAAATAAATA 6060

ACACAACAAC ATCGTAACAA CTTGTTTATG AGAGAAATnT TAATTTTCAA ACTTAGTTAT 6180

TAAGAAAnCA TTAAGATGTG TATGCAGAAA TAAATTTTAT A6CATTTAAT TGTGAAGAAT 6240 

ATTATGATAT TGCTATCGAG GTGAAGGTTA TG 6272

(2) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1978 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

AAATGATGTT TTACAATAAA TATAnAAACG TATCAACATA TATCATCATA TTTTTAGTTT 60 

CAAGTGCAGC CTTTGCAATA TTCTTGTTAA QTGCGnACAT TAGTGCTCAC TCGGAACAAG 120

TGTACGAAAT GACTGACCAT CAAATTAAGA ACAATAC6AT AAATAAAGCA TACGAACATA 180 

AAGACCCTAC AAACAATAGC GAACAAAGAG ATGGGAAAGT GTTCGCTTTA ATAAATTGAT 240 

ACATTGTCAC AACGTTATTT TGCCTATTTT TGCGmAATAG CGTTTTTTAT TACwTTTTTG 300 

CTGATSTTAA ATTTGTTATA TTTTGTTAAA GTATTATAAT GATTGAATAA ACAAATTGAA 360 

GGTAGGTTTT TTAATTGAGT AATTCTGATT TGAATATCGA AAGAATTAAC GAGTTAGCTA 420 

AAAAGAAAAA AGAAGTAGGA TTAACTCAAG AAGAAGCAAA GGAGCAAACA GCCTTAAGaA 480

AAGCTTATCT TGAGAGTTTT AGAAAAGGGT TTAAACAACA AATTGaAAAT ACTAAAGTAA 540 

TTGATCCAGr AGGTAATGAT GTAACACCTG AAAAAATTAA AGAGATACAA CAAAAAAGAG 600 

ATAATAAAAA TTAAATCACA AATCTGTAAA GAATTTTCTG ACATTATAAC TTGAAATAAG 660 

TATTTTACTT ATCTTTTTAT TTTAAAATAA GTTATAATGT ATTTGATAAA ATTGAAGAAG 720 

GGAAGATACA CAAGATGTTT AATGAAAAAG ATCAATTAGC TOTTGATACG CTACGTGCAC 780

TAAGTATCGA CACAATCGAA AAAGCGAATT CTGGTCATCC AGGATTACCT ATGGGAGCTG 840 

CCCCAATGGC TTACACTTTG TGGACACGTC ATCTGAATTT TAATCCACAA TCTAAAGATT 900 

ACTTCAATAG AGACCGTTTC GTATTATCTG CAGGGCATGG TTCAGCATTA TTGTATAGCT 960

TGTTACATGT TTCTGGTAGT TTAGAATTAG AAGAATTAAA GCAATTTAGA CAATGGGGTT 1020 

CTAAAACACC AGGTCATCCT GAATACAGAC ATACAGATGG TGTAGAAGTT ACTACCGGAC 1080 

CACTTGGACA AGGTTTTGCT ATGTCAGTAG GATTAGCTTT ACAGAAGATC ACCTAGCAGG 1140 

gAAATTTAAT AAAGAAGGAT ATAATGTTGT AGATCATTAC ACATATGTAT TAGCTtCTGA 1200

AAGTAAATTA GTTGTTTTAT ACGATTCAAA TGATATTTCA TTAGATGGCG AATTAAACAA 1320

AGCTTTTTCT GAAAACACAA AAGCTCGTTT TGAAGCATAT GGTTGGAATT ACTTACTAGT 1380

TAAAGATGGT AATGATTTAG AAGAAATTGA TAAAGCGATT ACTACAGCTA AATCTCAAGA 1440

AGGACCAACG ATTATTGAAG TTAAAACAAC AATCGGATTT GGTTCACCGA ATAAAGCAGG 1500

AACTAATGGT GTTCATGGGG CACCTTTAGG TGAAGTTGAA AGAAAATTAA CATTCGAAAA 1560

TTACGGTTTA GATCCTGAAA AACGTTTTAA TGTTTCAGAA GAGGTATACG AAATTTTCCA 1620

AAATACTATG TTAAAACGTG CTAATGAAGA TGAATCTCAA TGGAATTCAT TATTAGAAAA 1680

ATATGCAGAA ACATATCCTG AATTAGCAGA AGAATTTAAA TTAGCGATTA GTGGTAAATT 1740

GCCTAAAAAT TATAAGGATG AATTACCACX5 TTTTGAACTG GGTCATAATG GTGCATCTCG 1800

TGCTGATTCT GGTACTGTTA TTCAAGCAAT CAGTAAAACT GTCCCTTCAT TCTTTGGTGG 1860

ATCAGCAGAC CTTGCTGGTT CAAACAAATC CAATGTAAAT GATGCAACTG ATTATAGTTC 1920

T6AAACACCT GAAGGtAAAA ATGTGTGGTT TGGTGTAC6T GAATTTGCTA TGGGTGCT 1978

(2) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7588 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:

TAGTAGTATT TATTAAATTA TACGAAGGGA CCCAAGACAG AAAATTCATT TTATTGAATT 60

TTACATTTAT GTGCCAAGTT GGGAAAAATG TCTTATTTTT TCaAAGTATT TAAAAGTAAA 120

ATTACATGTT AATACGTAGT ATTAATGGCG AGACTCCTGA GGGAGCAGTG CCAGTCGAAG 180

ACCGAGGCTG AGACGGCACC CTAGGAAAGC GAAGCCATTC AATACGAAGT ATTGTATAAA 240

TAGAGAACAG CAGTAAGATA TTTTCTAATT GAAAATTATC TTACTGCTGT TTTTTAGGGA 300

TTTATGTCCC AACCTTTTTA GAATATTAAA TTTCTACAAT TTCGTCATCT TCAACAATAA 360

AGCCCATTGT ATTGACGCTG TTATTTAAGA AAGTCAGAAT ATAACGCATT ACTTCATCAC 420

GTTCTGGCTC ATTGTGAACC TCGTGGTAAA AACCTTGCCA AGCTTTAAAA TATAATTCAG 480

GTGTTTGATA TTTTTCTTTA AACTCATCAA TTGCCCTAGT ATCAACAATT AAATCCTTCG 540

TTCCATACAT TAATAGCGTT GGCATTGGTT GAATGTCATG AATATGAGCC ATCGTATCTT 600

TCATCGTCTC ATTAATTGTA TTATACCAAT GATACGTTGC TTTTTTTAAC ATTAAACCAT 660

CATTAAAACG TGTGTCTTTT GAAATTTTAC CTATATTTGA AACAAGTTTA TCTTTACGAT 780

TTTTTCCATT CTTTTGAAGT TCTAGCATAG GAGAAATTAA CATCATCCCC TCGATTGGCA 840 

ATTCTACTTT TTCAAGTAAA TTTAATAAAA TCAAACCGCC AAGTCCTACC CCTAATACAT 900 

AAGTAGGAAT TTTATATTCA TTAGCTATCT TTAACCAGTC TAGCAAACTT TCGTGATACG 960

TTTGAAAGTT TTCAATTTGT CCTTTATTAG CTCTTGAAGT TTGACCTTGA CCAGGCAAAT 1020 

CTCCCATAAT CACATGATAG CCATTTCTTC TTAACATCGT AATAACATAT GCATATCTTC 1080 

CCGTATGTTC TAATATATTA TGAGCAATAA CAACGACGCC TTTCGCATCA TTTTCAGCTT 1140 

CCCACTTCCA CATTATTATA CTGCCCCTTT TTCATTAATC TTCAATAACA TAATTATAGC 1200 

AAATTCACTA TGTAGATTTC TATTTATAGT ATTATTGTTG TCCATATTAT TATATATAAA 1260 

T6AAATCAAC ATCAATAATA GTGTAATTAT ACATAATTAT TTTTGATTGT TTTTGATGAA 1320 

AACGCTTTCT CGAATATTTT TTTCATGCTA AACTTATTGT AAACACAAGG GTTTOOAGGA 1380 

GTAGCAATGG CACTATTAAA GAATTTTTTT ATCGGATTAT CTAATAATAG TTTTTTAAAC 1440 

AACGCAGCAA AAAAAGTGGG CCCACGTTTG GGCGCCAATA AAGTC6TTGC CGGAAATACA 1500 

ATTCPAGAGT TAATTAATAC AATCGAATAC TTAAATGACA AGAATATCGC TGTTACGGTA 1560

GACAATTTAG GGGAATTTGT CGGTACAGTT GAAGAAAGTA ATCATGCTAA AGAACAAATT 1620 

TTAACAATTA TGGACGCGCT TCATCAACAT GGCGTAAAGG CACATATGTC TGTTAAATTG 1680 

AGTCAGTTAG GTGCAGAATT CGACTTAGAA TTAGCTTACC AAAATTTAAG AGAGATTTTA 1740 

CTTAAAGCAA ATACTTACAA CAATATGCAT ATAAATATTG ATACTGAAAA ATATGCTAGC 1800

CTGCAACAAA TTGTTCAAGT TTTAGATCGC TTAAAAGGCG AATTTAGAAA TGTTGGTACT 1860 

GTAATTCAAG CATATTTATA CGATAGCCAC GAATTAGTTG ATAAGTACCA AGATTTACGA 1920 

TTAMTTTGG TTAAAGGTGC ATATAAAG7VA AACGAATCAA TTGCATTTCA ATCTAAGGAA 1980 

GACGTAGATG CAAATTACAT CAAAATAATT GAACAACGTT TGTTAAACX3C ACGCAATTTC 2040 

ACTTCAATTG CAACACATGA CCATCGCATC ATTAATCATG TAAAACAATT TATGAAAGAA 2100 

AATCACATTG AAAAAGATCG TATGGAATTC CAAATGCTCT ATGGTTTTAG ATCAGAGTTA 2160 

GCAGAAGAAA TCGCAAATGA AGGCTATAAT TTCACTATTT ATGTACCTTA TGGCGATGAT 2220

TGGTTTGCGT ATTTTATGAG AAGATTAGCA GAACGCCCAC AAAACCTATC TCTTGCTGTA 2280 

AAAGAATTTG TGAAACCTGC TGGCTTAAAA CGTGTTGGCA TAATTGCAGC TTTAGGAGCT 2340 

ACAGTTATGT TAGGTTTAAG TACAATTAAA AAATTATGCC GTAAATAGAG CAAGACATAA 2400 

ACAATAATTT AGGAGTCTGG AACAATAATC AATGTTCTAG GCTCCTAAAT GTTATATTGG 2460 

TAGATTTTAA TAAATTAGCC ATTTCAATTG CACTTACTGC TGCTTCAGCA CCTTTATTGC 2580 

CAGCTTTCGT ACCTGCTCTT TCCACAGCTT GTTCaATAcT TTCAGTCGTT AAAATACCAA 2640 

ATATGACTGG TACATTAGTT TGATCATTCA CTTTAGAAAC ACCTTTCGCG ACTTCATTAC 2700

AAACATAATC ATAATGAGAC GTAGCACCGC GAATTACGCA TCCTAATGTA ATTACTGCAT 2760 

CATAATTTCC TGATGAGGCT AATTTTTTAG CTACTAAAGG AATTTCAAAC GCACCTGGCA 2820 

CAAATGCTAC ATCAATATTG TCTTCATTAA CATCATGTCG AATCAAAGTA TCTTTTGCAC 2880 

CTTCAAGTAA TCTTCCAGTG ATAAAATCAT TAAATCGACT AACTACGATT GCAACTTTCA 2940 

AATCTTTTCC AATTAATTTA CCTTCAAAAT TCATGTTAAA ATCCTCCTAT ATTAAATGAC 3000 

CCATTTTTAT TTTTTTCGTT TCCATATAAT CATGATTATG TACCX3TTTCT GGTACGATAA 3060 

CTTCAATTCT TTCTGCAATA TCAATGCCAT ATTGTTTTAA TCCCTCAAAT TTACTTGGAT 3120 

TATTACTTAA TAAATTGATA TGTTCGATGT TAAAATATTT TAAAATCTGT GCAGCAATAT 3180

GATAATCTCG CAAATCTTCA TCAAAACCTA ATGCTAAATT TGCAGTTACT GTATCATATC 3240 

CTTGCTCAAT TAATTCATAT GCGCGTAATT TGTTTAACAA TCCTATGCCA CGACCTTCTT 3300 

GAGGTAGATA AATAATCATG CCACCATGTT CATTGATATA CTTCATAGAC GATTCAAGTT 3360

GAGCACCACA ATCACAACGT TGACTATGGA AAATATCGCC TGTAAGgCAC GCAGAATGTA 3420 

AGCGTACATT TTCATGTTGT CGAATTGCAC CTTTTGTCAG TACAACTATC TCTTCATCTG 3480

TGTATGTCGC TTTAAAACCA TACATATCAA ATGTTCCGAA ATCTGTAGGC ATTTTCACTT 3540 

TTGCCTTAAA TTCAATTTCT GGTTCTAATT TTTTACGATA TTCAATTAAA TCATCAATCG 3600 

TAATCATCTT TAATTGATGT TTTTCTTTAA ACTTTTGTAA ATCTTGTCCT TTCGCCATCG 3660 

TGCCGTCATC ATTCATAATC TCACAAATGA CACCAGCGGG CTTGGCACCA GTAAGTTTAG 3720 

CTAAATCAAC AGCCGCTTCT GTGTGTCCAT TTCTAGCTAA TACGCCTTTA TCTTGTGCTA 3780 

CTAATGGAAA TAAATGACCA GGACX3ATTAA AATCTTTAGC TTCACTACTA GGATCAATGA 3840 

GCTTTTTGGC AGTCAATGTA CGTTCATAAG CACTAATTCC TGTTGTTGTA TCTACATGAT 3900 

CAATACTCAC TGTAAATTGC GTACCAAAGA TGTCGGAGTT ATCATCAACC ATTTGTACCA 3960 

AATCCAAACG TTGTGCAATA TCTTTAGACA CTGGTGCGCA TATTAATCCC CtTGCTTCTT 4020

TCGCCATAAA ATTAATGGTA TTATCGTTCA TCCATTCAGT AACCGCTACT AAATCACCTT 4080 

CATTTTCACG ATTCTCATCA TCTACTACAA TAATTGGTTC TCCATTTTTT AAAGCCATTA 4140 

AAGCACTGTC AATATTATCG AATTGCATGC TACCCCTCCt AAAAACCAAA TGCTCTTAAT 4200

TTATCTACAG ATAATTGGTC TTTATCTTTA TTTAAAATAT TTTCAACATA TTTAAACAAA 4260

CTCGTTTCTG GAATAAGATG AATGTCAAAA CTGTTATCAT GCTTATCAAA TACCGTTAGA 4380

CTAACACCAT CCACAGTAAT AGACCCTTGC TTAACTAACT GATTATTAAT ATGTTGGCTA 4440

CATTGAATCG TAATAATTTT TGCATTGGCT GTTTCATTTA TTTTTGAAAC TGTTCCTAGT 4500

TCATCTACAT GACCGAGGAC AAAATGTCCA CCAAACCTAC CX3TTACCACT CATGGCACGC 4560

TCTAAATTTA CTTCTGATTG TCGCTTAACA TCTGCTAAAT AGGTTTTATT TTCAGTGCCT 4620

TTAATTACTT GAACAGTAAA AGATGTCTGA TTAAAATCAA TCACTGTTAA ACATGCACCA 4680

TTAACACTGA TGGAATCACC AATATGCATA TCTGCCX3TAA TCTTATGTGC TTCAATTTCA 4740

ATCGTCCTGA CTGATTGAOG AATTTGAACA CTTTTAACGA CACCTATTTC TTCAACGATG 4800

CCAGTAAACA TGCATCATCA CTTCTTTCGT AAAGTTAATT TAACATTTTG ATTTAATAAC 4860

TCGGAATGAA CAATTTCAAA TTGGTTCGCA TCTGGTATCT CAATCACATC ATTTGTTTGA 4920

TAAAATTGAT AATTTCCAGA TCCGCCAATT AATTTCGGGG CATAATAGAG AATAAATTCA 4980

TCTATATAAT TAGATTGGAG AAATTCTGAA GTAGTGGTTG GACCTGCCTC GACTAGCAAA 5040

GTTCCAACTC CTCTTTTATA TAAATTGTGA AGAATTGTTG TTAAATCGCA AGACTTCAAG 5100

TAAATAATTT CAATATGTGT TTGATTGGTT GTTAAATTTG GATTTTCAGT ATATATCCAA 5160

ATTGGTGTTG ATTCATCTTG ATAAATTTGC TGATTAAAAT GAATATTCCC AGACTTAGAC 5220

AATATTACTT TTATAGGGTT TTTTCCATCT TGAATACGTG TAGTATATTG TGGATCATCT 5280

AATTCAACTG TACGTCTTCC AGTTAACACT GCGTCGTGTC GATGTCTTAA CTTATAGACA 5340

TCTTGTTTAA CCTCTTTGTT AGTAATCCAT TGACTTTGTC CATTATCATT CGCTTGTTTA 5400

CCATCTAAAC TTGCAGATAC TTTCACTGTA ATTTGTGGCA GTTGCTTTGC TTTTGCTTTA 5460

AAAAAGTCTT GGTATAATTG TGATGCCCGT TCATCATCAA CGCATTCAAC CTCAATACOG 5520

TGAGCCCX5TA ACGTCTCATC ACCATGTGTG TCTAACGAAT TGTCTTTTGT TGCGTATACT 5580

ACTTTTGCTA TCTTACAATC AATTATTTTG TTAACACAGG GTGGTGTTGA ACCAAAATGA 5640

CTACATGGCT CTAACGTAAT ATAAATCGTC GCACCTTCAG CATTTTGTTG TGCCATATCA 5700

AGTGCTTGAA CCTCCGCATG CTTGTCACCT TTTCTCAAGT GTGCACCAAT ACCAACAATC 5760

CTACCTTCTT TAACTACAAC AGCGCCAACG GGTGGATTAA CACCTGTTTG ACCTTGTACC 5820

ATATTTGCAA GTTGAATCGC ATAATCCATA AATTGACTCA AATGATCACC TCTATAAACA 5880

AAAATCCTCA CATCATGAAT TAAGATGCAA GGAGaAAAAT TTATCGTTAA ATAAGCCTAT 5940

TTGTACACAT TTTTACAAAT ACGCTACATT ATCTTTGTCG ATAATTAACA TTCTTTCTCC 6000

CATCCAGACT TTAACTGTCG GCTCTAGAAT CTCACTAGAT CAGCCACTAA TATGAAACAT 6060

TTaTATATGA AATTGTTATA GATTATTTGA GTACGTAGTA TGTCAACTAC ATTTAAAATG 6180

ATACTATATG TTTTCTGAAA AAACAATTAA TGACGGTTTT AATTTAATAT AATCTGAGTA 6240 

CTATAGGCAT CTCATTGATA TGATTCrTAC TAACAGACAT TAAAATCAAA CCTTCAATTC 6300 

GTCTCTATAG AGCGTTCTCT TTATTATCTT CTAGTTACAA ATTATTGATT GtCACtGCGC 6360 

TGTTGTTGCT CATTCX3ATTC TAAAGCATCA TATAATTGAG ATACTGTATG CGCAACTTGT 6420 

TCTACAATCA TTTTCACACC GTTTCGTAGT TTATTAACAC CGTTTGTCAT TTGACCTATC 64 80 

GCAATCATAT TTGTTAATGT TCCAAACCTT GGACTAATAA CTTGATTGGT TTCCGGAATG 6540 

ATTTGTATGC CTCCCATTGG GTGTGCTTGT ACAATTTGTC TATTTTCAAG ATTTCTAATT 6600 

AATTGATCAT C1TGATCCAA TTCATTTAAA TGACTTTTTG CACCTGTCGC GTTAATCACA 6660 

ACATTATATA TGTCTACTGA TTCTTGGTTT TTGTATOAAA AATAATACAA CTTGCCATaC 6720 

ATGTTCACAT CTTCTAAATC TTTTTTCAAA ATTAAAGACT TATTTTCTAT TAATTCAATA 6780 

ATTAGTTCAG CAGTTCTTGG AGGCATTGGA TTTGAATTTA ATTGAATCAT CTTTGAGTAT 6840 

TTTTGATTAA ATTGATGTTG GTCTTCAATA CTTAAGCTAT TCCATATCCA ATTTAAATTC 6900 

TCTTTCAAAT GTTCAATCAT ACTTTGGAAA ATGCCCaTTT CTGTTGGACG CGCTAAATCA 6960

TACTTCAAAT CTGCAATATG ATTTCCTGTA CGTCTATGTA CTAATTTTTT AAAATCAATG 7020 

TCATATTCAG CACATTCTTT TAAAAATAAA GAT^CTATUVG TATCAAGCGG TGCATTGCCG 7080 

AAATGATGTT TTTTAATGTC ATTTAATTTG TCTTTA6TTA AGTACTTGAA TGTCACGTCT 7140 

ATCATTGTAC CTCTTACACT TGGTAAATGA GCAGAACGAC TCGTCATAGT AATTGGTAAT 7200 

TTTGGATGAT GAGCAGCAAC ATAACGGACA ACATCTAAAC TGGCAAGGCC TGTACCAATA 7260 

ATCGCAATAT CX3TCCAGTTC ATTTACTTCG TCTAACGTAT TATATGTTGG ATAAGGCGTA 7320 

gcGJ&ATATC CTTTTTTACC CTTTAAGTTA TATGGATCAT GGTAGGCAAA TGTACCACAT 7380 

GTTAAAAATA CATAATCGTA CGCTTGCCAT GATTGTCCTG AATTTGTAGT ACATATGTAA 7440 

TAAGTTAAAT TCXSTTTCATC GATATTAGAA TTTGTATAAA TCTCTTGAAC TTTATTATAA 7500 

TTAGTTGATA TATTTGGATA TTTTTTCGTG AACATAGATA AATAAGATTT CATATAATGT 7560 

CCGAATACAA ATCTCGGTAA ATATGCAG 7588 

(2) INFORMATION FOR SEQ ID NO: 143:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10320 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:

nCTAGGTATT TTAAACCTAA TCTAGATAAA CTAGCTTCGT AAGCAGCTGC TACATTTTCA 60

CX3ACCGAAAT CCTCAAAATA TAATTTTGAA GTAATAAATA AGTCTTCTCT AGCAATACCA 120

GTTGACTCCA ATCCGGCACG AATGCCAGCA CCTACTTGTT CTTCATTCCC ATAAACTTTT 180

GCGGTATCAA TACTACGATA TCCTTGTTCA ATGGCATACT TAACACTTTC CATGCAATTT 240

TCATCATTTT CCACACGAAA TGTCCCTAAA CCAATTTGTG GCATCGTGTT TCCATTATAA 300

AATGTTTTAA CCTCCATAAA TATCGCCTCA CCTTTTTGAT GTATTATACC CTGTTATCAT 360

AACAAATCTG AGTTGAATAC ATGAGAAAAA ACACTTAGAG CAATCAACCA CTAAAATTCT 420

AGTAATATCT CTCAAATATT AATCAAATTG TAAAAGTAAT TCTGTTTAAT TTATGACAAA 480

CTAAAAAAGC CGAAGTAACA ACATATAGTC ATCACTTCAG CCTAACATTT AATTGAATGA 540

TTCAATTTTA TCCATCATTT GTTGTAAGTC TTCCACCTTG TATTGAATAC GACCATGGAA 600

TACAAATTTG TTAAAGAACT CGTCTAATTG TTCAGCACCG ACAAGCACTT TGACAGCACT 660

ATTTTGATTA TAATTTGAAA TCGTTACATC GCCTTCATTT TTAAGATTAA AGTATAAAAT 720

TGAAGTTGGT GTATATTTGG CACCTAATTC TTTTTGTAAG TCTTCAGCCA ATTGTTTAAT 780

CGCCTCAATT TGATCTGAAT AATTTACAAA TGATAATGAA CGTTTGTCAT CATTTTGATC 840

CATCACAATA GTTTGCGGTC TAGATTTATC TAAATCCAAT GTATCAAATA CTTGTTCCAT 900

TGGTGGTAAA TCTTTAAATT GACCGCCACT AATACCATTA TAAACATGAC CTTTTAACAA 960

TTGAGAATCA ATAATATAAA GACCAGTTCT TGTTAATACT AAATGACTAA TTCXjTTCAAT 1020

ATTATTAAAG CCATCCTTTG GTAAAAAGAT ATTTGCCATA ATGTGCATAT CTTCTGGTCG 1080

AATTCGTTTT TCTTTAACTA ATCTTTCACG AATACCAATT AATCTCATGT CCGTTACATA 1140

TTCAGTATGA TTTTTCGAGA ACAATTTTAA TGCGTCAATC TCACGATCTT TTGTACTAAC 1200

CATGTGATTA TAATCTTCTT GTTGTTTTGT AATTGTCTTT TTATTTTGAA tacgctcttt 1260

CTCTAAAGCT TCTTCATGAG ACTTTTTAAT GTTTTGTTCT TGTTGTTCAT ACTTTTCTTC 1320

TGTTTGTCGC TTAACTTTTT TCTTACTACC TAAGGCAACT AAAAAAAGGA CAAAAAAGAT 1380

TAATGCAATG AgCTACTGCA ATAATGAGTC CAATGACTAT CGGTGAAGAT AAATCCATCA 1440

CAACAACGCT CCTTTTTAAT ATATGAATAA CTTTAATTAT AATAGAaAAG CTAAAGATTT 1500

TCGATACATA TTATCATTTA TATACCGAAA ATCTTTTATT TAGCTATATT CAATTCATCT 1560

TATTATTTTA CTGCGTCTTT TAATTCTTCC ACTTTGTCTA ATTTTTCCCA TGGGAATAAG 1620

ACATCTGTAC GTCCAAAATG ACCATAAGCA GCAGTTTGTT TGTAAATCGG TTGTTTCAAA 1680

AGTTGCCCTT CAGAAACTTT ACCTGTTCCA AATGTATCAA TTGCAATTGA CACTGGTTCT 1800

GCAACACCAA TCGCATATGC CAATTGTACT TCACATTGAT CTGCTAAACC TGCTGCAACA 1860

ATATTTTTAG CCACATAACG TGCAGC6TAT GCAGCTGAAC GGTCTACTTT TGTAGGATCC 1920

TTACCACTGA AGCATCCGCC ACCATGACGT GCATAGCCAC CGTACGTATC AACAATGATT 1980 

TTACGTCCTG TTAATCCTGC ATCACCTTGA GGTCCACCGA TTACAAAGCG TCCTGTAGGA 2040

TTGATGTAGA ATTTAGTTTG TTCATTAATC AAGTTTTCTG GAACAGTTGG ATAAATGACA 2100 

TGTGCTTTAA TGTCTTCTTG AATTTGTTCA AGTGTCACAT CCTCAGCATG TTGTGTTGAT 2160 

ACGACAATCG TATCAATACX5 TACTGGGTTA TCATTTTCAT CATATTCAAC AGTGACCTGA 2220 

ACTTTACCGT CTGGTCGTAA ATAATTTAAC GTACCATCTT TACXjCACATC TGATAAACGT 2280

TTT6CCAATT 6ATGTGATAA ATAAATTGCT AGAGGCATAT ACGTCTCTGT TTCATTCGTT 2340 

GCGTAACCAA ACATTAAACC TTGGTCACCT GCACCTGTTG CTTCAATTTC TTCTTC6CTA 2400

TCTTTATCAC GATACTCTAA TGCTTTATCC ACGCCTTGTG CAATGTCAGG TGATTGTTCA 2460 

TCAATCGCAG TTAAAATTGC CATTGTTTCA TAATCATAAC CATATTTTGC TCTTGTGTAT 2520 

CCAATTTCTT TAATTGTTTC TCTAACAACT TTCGGAATAT CAACATATGT TGTTGTA6AA 2580

ATTTCGCCGG CGATCAATGC CATACCTGTT GTAACAGTTG TTtCACAAGC TACACGTGCA 2640 

TTTGGATCGT CTTTTAAAAT AGCATCTAAT ATTGCATCTG ACACTTGGTC AGCGATTTTA 2700 

TCTGGGTGTC CTTCTGTAAC AGACTCTGAA GTAAATAATC GTTTGTTATT TAACATAGTT 2760 

TGCTCCTTTA AATTTATATT ACGAAAATTC TCTCTCTGTG AGCTAAATAA AAAAGACCTT 2820 

CTAACTATTA ATATAGAGAG AAGGCCTAAT ACGTCCATTC GCTCTTATCG TTCAGACCTA 2880 

TTTGTCTGCA AAcGGTTTGG CACCTTTCTT TTATAAAAAA GAGGTTGCTG GGTTTCATTG 2940 

GGTCCATGTC CCTCCACCAC TCAGGATAAG AGAATCCGTT AAAAATAATA GTACCTAATT 3000 

AATGAATTAA TGTCAATTTT TCACAAATAA ATTTACAGTA AAATATTGTA GATTAATTAT 3060 

GTTAATGTGT TATACTAATT AAATGTAAAG GCTTACATTT AAATTATCGC TTTGGAGGGA 3120 

TTTAGGATGT CAGTAGACAC ATACACTGAA ACAACTAAAA TTGACAAATT ACTGAAAAAA 3180 

CCAACGTCAC ATTTTCAACT TTCGACGACA CAACTTTATA ATAAAATCTT AGACAATAAC 3240

GAAGGGGTAT TAACAGAACT TGGTGCTGTT AATGCAAGTA CTGGAAAATA TACTGGTCGT 3300 

TCGCCTAAAG ACAAATTTTT TGTCTCTGAA CCTTCATATA GAGATAACAT TGATTGGGGA 3360 

GAAATTAATC AACCTATCGA TGAAGAAACT TTCTTGAAGT TATACCATAA AGTACTAGAC 3420 

TATTTAGATA AAAAAGATGA ACTATACGTA TTTAAAgGcT ACGCTGGTAG CGATAAAGAT 3480 
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ATGTTTATTA GACCTGAATC AAAAGAAGAA GCTACAAAGA TTAAACCTAA CTTCACTATC 3600 

GTTTCTGCAC CACATTTTAA AGCAGATCCA GAAGTTGATG GTACTAAATC TGAAACCTTT 3660 

GTCATTATTT CATTTAAACA CAAAGTCATT TTAATCGGCG GTACTGAATA CGCTGGTGAA 3720

ATGAAAAAAG GTATCTTCTC TGTAATGAAT TATCTCTTAC CGATGCAAGA TATTATGAGC 3780 

ATGCATTGCT CAGCAAACGT TGGTGAAAAA GGCGATGTTG CATTATTCTT TGGTCTATCT 3840 

GGCACTGGTA AAACAACCTT ATCGGCTGAC CCACACCGTA AACTAATCGG TGATGATGAA 3900 

CACGGCTGGA ATAAAAACGG GGTCTTTAAT ATCGAAGGTG GCTGCTATGC AAAAGCAATT 3960 

AATCTTTCCA AAGAAAAAGA ACCACAGATT TTTGACGCAA TCAAATATGG TGCAATTTTA 4020 

GAGAACACTG TAGTTGCAGA AGATGGTTCA GTGGACTTTG AAGACAATCG TTATACAGAA 4080 

AACACGCGTG CCGCTTATCC AATTAATCAC ATTGACAATA TTGTAGTACC ATCTAAAGCA 4140 

GCACATCCAA ATACAATTAT TTTCTTAACT GCGGATGCAT TTGGTGTTAT TCCACCGATT 4200

TCAAAGTTAA ATAAAGACCA AGCAATGTAT CATTTCTTGA GTGGTTTCAC TTCTAAATTA 4260 

GCTGGTACAa GCGTGGTGTG ACAGAACCTG AACCATCATT CTCAACATGT TTCGGAGCAC 4320 

CGTTCTTCCC GTTACACCCT ACTGTTTACG CTGATCTATT AGGTGAACTT ATCGATTTAC 4380

ATGATGTTGA TGTTTATCTT GTTAATACTG GATGGACTGG CGGAAAATAT GGTGTAGGAC 4440 

GTAGAATCAG CTTACATTAC ACACGTCAAA TGGTAAACCA AGOGATTTCT GGCAAATTGA 4500 

AAAATGCAGA ATATACAAAA GATAGTACGT TTGGTTTAAG CATTCCTGTA GAAATTGAAG 4560

ATGTACCGAA AACAATTTTA AATCCAATTA ATGCTTGGAG CGACAAAGAG AAATATAAAG 4620

CACAAGCAGA AGATTTAATT CAACGTTTTG AAAAGAACTT CGAAAAATTT GGTGAAAAAG 4680 

TTGAACATAT TGCTGAAAAA GGTAGCTTCA ACAAATAAAT TTGAATACTA AATCaAAACC 4740 

ACCCSGTGTGA ACGGGTGGTT TGTTCTGCGG CTATAAGCCT TCCTTACTGG CCAGCCCTAA 4800

AAGGGCACTG ACAAGTCAGC CAACTGCACT ACTATTCCAG CAACCCTAAA GGGTTACTCT 4860

TTTTTCTTTC TTTTTTTATT TTTCTCTCCA GTGAAAGGAT CTAAATATTC TTCCATTGAG 4920

ATTTGGTCTG CAACGATATC CTCTTGTAAT TGATTACGAA TATAATTTTC AATCACTTTT 4980

TTATTTCTAC CTACTGTATC CACATAAAAT CCTTTACACC AAAACTTTCT ATTTCCATAT 5040

CTATACTTTA AGTTAGCATG TCTATCAAAT ATCATTAAAC TACTTTTTCC TTTTAAATAG 5100 

CCAACAAATG ATGATACCCC AAGTTTGGGT GGTATACTAA CTAACATATG GATATGATCT 5160 

TTACATGCCT CTGCTTCAAT TATCTCTACA CCTTTTCTTT CACATAATTG ACGCAATATA 5220 

ATCCCTATAT CTTTTTTTAT TTTTCCATAT ATCACTTGTC TTCTGTATTT AGGTGCAAAG 5280 
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AAATAGCATC TCCTCGTGTT GATTATTTTG GTTGGCTGAC CAATATTTAT TCTAGCACGT 5400 

AGAGATGCAT TTTTTGTGAC AATGGTAGAA CCTTTTCtGa ACCATACGCA TAGCGTATGG 5460 

TTTTCTTTTT ACAATTAAAG AGCCAACCGT TGTTATAGTC TAACAATGGT TGGCTCCTCT 5520 

TATTTTATGT GCTAAAAATT TATAGGCAAT TTTATTACAA CAATGTACAT TTAAGGTGAC 5580 

CTTCATGCCA AAATCGCATC ACTCATTTAA TGGAAGCAGC ACGTCTTCAT ATAAAGTACC 5640

GATCCCTAAT TCAACGCATG TAGTACCACA TCTTCAAAGC TTGATAGTTC CCATGCGCAC 5700 

ACCACGTTTC ATACTAGCTA TGCXSACTCAA CTTGGTTCAT AAACTCTTTA ATATAAGTCA 5760 

ATGTTTCAAC CATCGCTGGT GGTCTTGGCA CATGTCCTTC TGCCATTTGA TAAAATGTTT 5820 

CATGCGTGGC ACCTTTTAAC TCTAGTTGGT CCGCTAAATA ATACGCATGA TGAATACCAA 5880 

CTTGCTGGTC TTTCCCTCCA TGTACAATTA ATATTGGCGG ACTGTTTTCA TTAATGTTTG 5940 

GAATCGCTTG GCGTGCCTCA TATGCCX5CTC GATCTTTTTT CGGATGACCA ATCATTCTTC 6000

GTAGCATGCC TCTTAAATC6 ACACGTTCTT CATACATTAA ATCAATATCT GAGACACCAC 6060 

CCCAGATTGT ATAACTTGTT ACTGGTAAGT CTTGAAATGT CAACAATCCT TGTAAACCAC 6120 

CTCGCGAAAA ACCAACCATG TGGATAAATG CATGTGGATA TTTATCATGT AGCAACCTTA 6180

ATAATTGCGT CACATCATTT AAATCGCCAC GGTAAAATTC GTCTTTGCCT TCACTCCCAT 6240

TGTTACCTCG GTAGTATGGC CCAATCACTA AAGTTTGACT ATCTGAAAAT TGCATTAATC 6300 

TACCTGCGCG CACACGTCCT ACTTGACCTT TGCCACCTCG CAAATAAACT ACAATGCGAT 6360 

TTACTTCATG ATGTGGTGTC ATCATTAAAG CTTTTACTTG TAAGTCATCT GACAAATATG 6420 

TAATTTCTTC GAATTGATGC GTAAAATATT CAATTGGCAT TCGTTTACGT TTGATAAAAC 6480 

CCAAGTGATT GCACCCTCTC TACGCATTTT AAAATGGTAC TATCTTGCAG TAAGAAACTC 6540 

CGTTGTGCGA GTTCAATATC ATTGATACAG TTAAACAACA CTGGCCCTGC TGTTTCTAAA 6600 

TAATCGTTCT TGCTTACCAA TGATTCAACT TCGATAAAAT ATACATCTTT TACAAAATCA 6660 

GTTTGATCAT GTGTTTCAAT GGTATATTGT GCTATGTAAT AAATATTTTT AACTTTGGCG 6720 

CCTGTTTCTT CATATAATTC aCGTGTAACT GCTTCAGCAC TACTTTCCCC GCGTTCCCTT 6780 

TTACCACCAG GAAATTCAAT CCCCCGTAAA TTATGTTTGG TAAAAA6CAA TTGATTTTTA 6840

AACGTTGGAA TAGCTAGCAC ATGATTGCCA TCTGCTATCT CATTATCCTT TTTAAATGTC 6900 

AAATTAACTT GACGATTATC TTTATCCCTA AACTTCACGC GCATCACATC CCTACATTGT 6960 

ATGTTAATAT AATAGTTAAT TACTATCGTT GGAGGCATTA ATTATGAAAA AGATATTCTT 7020 

GGCGATGATT CATTTTTATC AACGTTTCAT TTCGCCACTC ACTCCACCAA CTTGTCGTTT 7080 
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5 



CCTTTATTTA GGTATCCGTC GTATTTTAAA ATGTCATCCG CTTCATAAAG GCGGCTTTGA 7200

CCCTGTTCCG TTAAAAAAAG AGAAGTCAGC AAGCAAGCAT TCACATAAAC ATAACCATTA 7260

ATATGGTTGT AATTGAGTTA TATCCACTAA AGGGGGGCGA AATTCGAGTC GCCCCTCTTT 7320

TAATATGCCT GAATGCGCCA CCACATCTTG TTCAAAATAA TAACCTGCTG GTGTAACATC 7380

TCCTGGATAA TCACCTTTAC GAGCAAGCAT CGCTGTAAAA TAGCGGCTTA AACCATATTC 7440

GTACATGCCG CCAATAACCA CTTTTGCACC ATGACTTTTC AAAGTATCAA TTGCCGTTTG 7500

CACTTTATCA ATGCCACCTA GACGAAATGG TTTTAATACA ACAACTTTCA CATTGTATAA 7560

TTCTATCAAA TTAATTATGT CCaACAACGA TGTTGCCTTT TCATCAAGGG CTATTGGAGG 7620

TATTGTTCCA TCCGCTACTT CATCAAGCAT GGAGATATCT TTAAATGGCT CTTCGATATA 7680

AAGAACCTGT TCACGCGCTA ATAACTGTAA CTGTGTGAAA TCTTGACGAT CCAA6GACTC 7740

ATTTGCATCT ATAACCAATT GAAAGTGAAA GTCTAATTCC CGTAACACTC TAATTTGATG 7800

CATGATTTGA GGCGTCCATT TTAATTTAAT TCTGGTCGGC TTTGTTGCTT TTAATGACTC 7860

TAGTTGTTTA TTTGATAAGC CGCTCGcTGT CGCTCCATAT GCTACTGAAA ATGAAGGCAG 7920

TACATGAAAC ATTTGATACA ATGCCATGAC AATAGTTGCC CTTGCAGCAG GCGTATTTTC 7980

CAATGAATCT ACTAATTTTA GTGCTGCTTC ATACGTTTCA AATGATTTAT TTCTATTATC 8040

TTCGAACCAT TGCTCAATTA CATGTTTCAC TGAGGCAATT GTTTCATGAT CATACCAATC 8100

TGTTTGAAAA GCGTTACATT CCCCGAAATA TGCATTTCCT TTGTCATCAA TCAATTCGAT 8160

AAACAAACAA TCACGATGCG TTAAAGTGAC TTTCGGTGTT ACAATTTGTG ACTTAAATGG 8220

CTCACTATAT TTATAAAAAT GCAAAGCTGT CAACTTCATC AAATCATCCT CTATACAACT 8280

TATTTCTTTG TAATTTACCT GTTGATGTAT AAGGTAAAGT ATCAACCTTT TCAAAGTGTT 8340

TCGCTACTTT ATATTTCGCT AAATGTTGTG ATAAATATGC AATCAATTGT GCCTTTGAAA 8400

TGTCACTTTC ACTGACAAAA TATAATTTAG GCACTTGGCC CCAAGTATCA TCAGGATGCC 8460

CTACACATAC TGCGTCACTG ATACCTGGAA ATTGCtTCGC TACCGTTTCA ATTTGATATG 8520

GATAAATATT TTCACCGCCA CTAATAATTA AATCTTTACG TCGGTCATAA ATCATGACAT 8580

AACCTTCATG ATCTATTTCA GCAATGTCAC CCGTATTAAA ATAACCATTT TCAAACGTAC 8640

CCGTTAAATC TGTTGGATAC AAATATACAT TCATCACATT GGCGCCTTTA ATCATTAATT 8700

CTCCATGACC TTCTTTATTA GGATTTTTAA TTTTTACGTC AACATTGGCA CTTGGCATCC 8760

CTACAGTGTC AGGACGTGCA TGCAACATTT CCGGTGTTGC TGTTAAAAAT TGCGAACATG 8820

TCTCAGTCAT ACCAAATGAA TTATAAATTG GCAGGTTATA TTGTAATGCC GTCTCTATCA 8880
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AACCTTGTTG CATAAGCCAA TTTAAAGTTT GTGGCACAAG CGAAATGTGC GTGATTCGTT 9000 

CATTTTTAAT CATCGTTAAA ATTTGTTCGG CATTGAATTT ATCAACAATG CGCACAGTAA 9060 

AACCTTCAAT AACAGCTCTT AAAAGTACAC TGAGACCCGA AATATGATAA ATCGGCAAGA 9120 

CAGATAGCCA ATTAGTGTCA CGATCAAATC CCAAGCTCTC TTTACATCCX3 ATTGCACTGG 9180 

CATAATGATT ACGAAACGTT TGTGGCACCG CTTTTTGAGG GCCCGTTGTC CCTGATGTAA 9240 

ACATAATCGA TGCAATGTCA TCTAAATTAA ATGATGTATT TAATATGTTG GACGGCGACT 9300 

CTTTCGGCAC CACAGTTTCA TTCGATGTTT CATATTGGAT ACCCATTGTG TTGTCCAACA 9360 

AACTGTTCGT TGTAATATCC CTTCCAGCGA ATTCAATATC ATCCAGCGAT ACAATTTGAA 9420 

ACCCTCGTAA TTCCAGTGGC AAGGTACAAA AAATCAATTG TACATCGATT GACTTCATCT 9480 

GATTCGTCAT CTCATTAGGT GTCAACCTTG TATTAATCAT CGCAATTTCA ATATTTGCCA 9540 

ACCAACATGC ATGTATTAAA ATGATCGATT GAATCGAATT ATCTATGTAT AGCCCAACAC 9600

GAGATTGTTG ATAAGCCTTG AGTCTTTTAG CCAATAGACT CGCTTCACAG TATAAATTTT 9660 

QATAAGTATA A6ATTCTTGA CCGTCTGTTA TCGCAATATG ATGTCCATTT TGTTGTGCTT 9720 

GTTTATATAA CCAAAAGTCC ATGCGTTATT CCTCCAAAAT CATTTACATT ATAATTATAA 9780

CGATTTTATG ACATTCTAGC AGTGGTTATG TTTAAAAATA TAAAAAAGTA GACGAATTGA 9840 

TGCATTGATA TGATTGTTAT AATGCTCAAT ACATATCGTT ATATCATTCG TCTACTATTA 9900 

TCAGTTATTT TTATTTAATT TTAGTGTCAT TCTGTCATTT TGATGTGGTG ATTTACCCAT 9960 

TGTTGCCACA TCATCTGCAA T6TCAATTGG TATACGGTTC ATGTCTTGTA ATGCACTTAA 10020 

ATGGAATACT TCATCATCTA AATTTTCAAT GAGATATACA TAATATGTTA CCTTGTCCTT 10080 

TTTATATTTT AACGTTTTCC AAAAGTCCGG CTTGCAATTC AATACATTAT CCGGAATATA 10140 

TTOJATAAAT AAGTAACGTT TGCTGCCTAC TTTGTCTATG AAATATTTTG CAGTGCCTTT 10200 

TTCTATACCT CTTATATGTG CATAGTCTGC TGAAAAGTAA ATACTACCTA TTGTTTCATT 10260

ATGTTGTTGT ATTTCAAATC GTTGGCCTAC TATTTTATTA TTTGTGCTAC nGGGGACTTA 10320 
(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1477 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:
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6TGTGGATTG GATTTTAAAA 


TCACCCTCAT AAATACTGTC ATCAATATGA TAAGTTACAA 120

TTTCACCTAT TATTAAATCA GCCCCATCTA ATACATCTCC AAGCAATATC ATTTGCGmTA 180

GTTTACATTC GAATCTCATT TTCGCATCTT TAATTCCTGG CGTCTTAATC GTTGTAGATG 240

TTAAAAGTGA TAATTCTGTA CGACTCAACT CACTGTCACC ATATGCTAAC GGCGCTGCAG 300

TCTCATTAAT ATCTTGAACA TTATCTTCX3T CTGTAATATG CACAACAAAG TCTCCAGTCC 360

GTTCTATATT TAATGCAGTA TCTTTTCTCT TACCTCCTGC ACXTTTGAACT GGAATAGCAA 420

TCATTGGCGG ATGATTATTA ACAATATTAA AAAAGCTAAA TGGTGCTGCA TTTACTGATG 480

CATCTTGATT TAAT6TTGTA ACAAAAGCTA TAGGTCXHXSG AATAATTGAA CCAATTAATA 540

ATTTATAGTT TTCTCTAGCA GTTAATGATT GTGCATCAAA CGTATACATA ATACCTACCT 600

CTTTTCTAAG TATATCTAGG TATTTCTCCG ATTTTGGTTA ATTTAAACAT CTATTCTCCT 660

CTGAAAATCA CTTGTATTTA TTTAGCAAAT CTTTTGAAAT ATGACACATA TGCATATCTT 720

CTGGATATTT TTCTAAATGT TGCTGATGTT CTTCAGCACT TTTAATGTAG TTAGACAGCX3 780

GTAAGACTTC CACTGCAATT TGATCTCTGT CTTTACGTCG TTCAATGAAC TGACGCGCTT 840

CAATTAAGTG GTCATCTACA CAACTATATA AACCCGTTCG ATACTTTTGT CCAATATCAT 900

TTCCTTGTTG ATTCACACTG TAAGGATCAA TGATTTCAAA TAAATAATTC ATAATGTCTG 960

TAATTGTTAA CATACGATCA TCGAAATGAA GTTTGACACA TTCAGCATAA CCATCATACG 1020

GACCGTCTAA TTTAGAGCTT CTTCCATTTG CTCTTCCTGC TTCTGTATGT ATAATTCCAG 1080

GTATTGTTGC AAAAAATGCT TCAACACCCC ATAAACATCC TCCTGCTACA TAAACAACTG 1140

CCATATTTAC ACCTCATCAT CCTTTTTTAT ATTTTTAACA AGGTTATACC ATTTAATACC 1200

GCCATGACAT GATTCTGATA CACCTTCATT ACGATACCCA TATTTTTCAT AAAATGAAAT 1260

TAATGATTCT CGACATGTTA ACGTTACACC ATGTCGATGA TGATTCTTAG CAAGAGTTTC 1320

AAAATAGTTT AGTAAGCGAC CTGCAATACC CTGACCTTGA TAATTTGGTG CTACAACAAG 1380

ACCTAACACA CTAATATAGC CACCTTCACT ATTATTTGTG GAGACATTTT TAAATAAATC 1440

ATCGCTAATG TAACGCTCTT TTATGACTGG ACCGTT6 1477


(2) INFORMATION FOR SEQ ID NO: 145:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3976 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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CACTATCATA ACATGCATCA GCTACAATAT ACTCCGGTAA ATAACCGAAG nTATTTTgAA i860 

TCATTGTTAA AAATGGAATT AAAGTTCTAG TATCTGTTGG GTTTTGAAAT AGGTCATAGG 1920 

ATAAAACAAA TTGAGAATTT GTCGCTATTT GTAAATTGTA TCCTGGCTTA AGTTGGCCAA 1980

AGTGTCTTAT TTTTTTAAAG TATTTAAAAG TAAAATTACA TGTTAATACG TAGTATTAAT 2040 

GGCGAGACTC CTGAGGGAGC AGTGCCAGTC GAAGaCAGGG GCCCCAACAC AGAArcTGAC 2100 

ATATAGTCAG CTTACAACAA TGTGCCGGTT GGGGTGGCTG AGACGGCACC CTAGGAAGGG 2160 

ACCCGTCATC AAAAATTCTA TTTATAGAAT TTTACAGTAA TGTGCCAGAT GGGCATAGCG 2220 

AAgcCATTCA ATACGAAGTA TTGTATAAAT AGAGAACAGC AGTAAGATAT TTTCTAATTG 2280 

AAAATTATTT TACTGCTGTT TTTTTTAGGG ATTAATGTCC CAGACTCTTT AGTTTATTTA 2340 

TTTTCAATAT AACAATTGTC TAATCAAGGA TTAACGAATA TTTAAAGATA GTTTGACGCA 2400 

ATATTAGAAA CAACCTATAA TAATAGTTTG TTTGTGGATT AACTATTATA AATAAAAGCG 2460

GCGTAAAGAC ATATAAACCA ACTACTTGAA CAATATAACG TTAATAACAA TCTATACTGA 2520 

TACATTACGC CTAGATAATC TTTGATGAGC ACATGTAAGA AAAAGTGATA TGGTGTATGA 2580 

CTTCCGACAC CATCGATAGA TAAACCTAAT TTTTGGGCTA GTCGTAAGGC GCX5CAATACA 2640

TGAAACTGAC TTGTtACACA AACAATTTTA ACTGCTTCAT GATACAAATT GTTGATGATT 2700 

TGTTTAGAAT ATAAAAAGTT TGTGTATGTA TTTATAGAGT GAGATTCCAT TAGTATATCT 2760 

GTTTTATCAA CACCATGTGC AATCAAATAA CGTTGCATAG CTAAAGCTTC AGAAATTGGT 2820 

TCGTCTGGTC CTTGTCCGCC AGATACAATG ATCTTTGTTG CTGATGCTTG TTGTTGATAG 2880 

ATATCAAGTG CACGATCTAA ACGCGCTGCA AGCATTGGTG TGACAAATTC GGTAAAAATA 2940

CCAGCACCTA ACACAATTAT GATATCAACT TCTTTGTTGT ATGATCTATG TCTATATGAT 3000 

ACTOTCCAAA CGAGATAACA AATAAAGGTT AGTAACAGGG AAAGACATAA TATAGCTAAC 3060 

CACATAGACA AACCTTTCAC AATAGGTGAC TGAATCGTAC TTATAAATAG AAGTGCTGAT 3120 

GTGTAGAGTA CAAATTTATA TGAAAAAQAT AATAATTTTT TAATAAATAA GCGACTAGAA 3180 

GTATGAGAAA ATAAATATCT ATGTTTGAAT AGCATGATAA TACTGATTAT TATAAATGTT 3240 

ACAAACATAG ACCAAGGGAA AGTATAGGTC ATGATGCTAT AGATGAGTGA CAAAAATATC 3300

GATATGACAA CTAAGATGTA GCATGTTAAA TTTAACGTCA GAGTATAGTT GAAAATTAAC 3360 

GGACAAATAA CGATAAGTAT AAATATTAAT AATAAATTCA ATAACATACT GACACCTCGC 3420 

TTATAATAAA TATTAAATAT AAATGTAGAT GATTTAATTT ATTAAAGCAA GGAGAAAGCA 3480

GCAACATGTA AATCTTAATT TGTTATATTA TATATGGGTC AATATTTTTG TGTTTTTTAG 3540 
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TATGGTAAAA CATTTACAAG ACCATATTCA ATTTTTAGAG CAGTTTATAA ATAACGTTAA 3660 
CXSCATTAACT GCAAAAATGT TGAAAGATTT ACAAAATGAA TATGAAATTT CATTAGAGCA 3720 
GTCTAACGTA TTAGGTATGT TAAATAAAGA ACCTTTGACA ATTAGTGAAA TCACGCAAAG 3780

ACAAGGTGTA AATAAGGCCG CAGTAAGCCG ACQAATTAAA AAGTTAATCG ATGCTTAATT 3840 
AGTTAAGTTA GATAAACCAA ATTTAAATAT TGATCAACGT TTGAAATTCA TAACCTTAAC 3900 

TGACAAAGGT AgAGCATATT TGAAAGAACG TAATGCGATT ATGACAGATA TTGCGCAAGA 3960 
TATTACTAAT GATTTA 3976 
(2) INFORMATION FOR SEQ ID NO: 146:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3346 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:


GCTACCTAGG CATTTAAQAG ATCAAAAAAT GTATGAATAT GAACGTTATT TTTATGAGCA 60

AGAACTTAAT GGCGTTGATG aAGGGGAAAT TTTAAAGAAG TTAAAAGACC CACAAGATGT 120

TGCAGCTGAA ACAAAAGCTA GAAGTGTTAT TGATTATGCT GAATCTAAAC CAACATTTGA 180

AAATATTTCA AGAGCTGTTG CTGCTTCATT AAGTTTAGGC ATTCTATCTA TTTTTGTCAT 240

CCTTATACCA GTATCTATAG TTGGATTATT TGTATTAGCA TTATTTTTAA TATCACTTTT 300

GCTGCTGTTT TGTCCAATTA TTTTATTAGC ATCAGCAATA TCCAGAGGAA TTGTGGACTC 360

AATTAGTAAT GTATTTTTTG CCATATCATA TTCAGGATTA GGATTAGTAT TTATCATTGT 420

CATATTTAAG ATTTTAGAAT ACATTTATCG TTTAATCTTA AAATATTTAC TTTGGTATAT 480

TAAAACTGTC AAAGGAAGCG TTAGAAAATG AAGAAATTCT TTTTTATTGG GCTTTTAGTG 540

TTTGTTGTCT TTTTTACAGC AGCAACCATT ATTTGGTTCA GCTATGATAA AAACAAATAT 600

GGTACTAAAC AATATGATAA AACATTCAAA gACGATGCTT TTGACAATGT ATCTATAAAT 660

TTGGATAGTA CAGAACTTCG TATAAAACGG GGGAATCAAT TTAGAGTTAA ATATGATGGT 720

GACAATGATA TATTAATTAA TATAGTAGAT AAGACGITGA AGATTAGTGA TAAAAGGTCT 780

AAGACAAGAG GATATGCAAT TGATATGAAT CCTTTTCATG AGAATAAGAA AACGTTAACG 840

ATTGAAATGC CTGATAAAAT GATTAAACGT TTAAATCTAT CATCTGGAGC AGGAAGTGTT 900

AGAATCAGTG ATGTTGATTT AGAGAACACA AGTATTCAAA GCATTAACGG TGAAGTAGTT 960
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AGTAAAAGTA ACATTAAAAA TAGCAATATT AAAGTTGTTA TTGGTACGCT ACAAATCGAC 1080 

AAGAGTCAAA TTAAACAATC CATATTTTTA AACGATCATG GTGACATTGA ATTTAAAAAC 1140 

ATGCCATCAA AAGTAGATGC AAAAGCTTCT ACTAAACAAG GAGATATTCG TTTTAAGTAT 1200 

GATAGTAAAC CTGAAGACAC TATACTAAAG CTAAATCCGG GAACGGGTGA TAGCGTAGTT 1260 

AAAAATAAAA CATTTACTAA TGGtAAAGTT GGGAAAAGCG ACAATGTTTT AGAATTTTAT 1320 

ACGATTGATG GTAATATCAA AGTTGAATAA ATAAAGGATG TAAGCACCGA TATTAGGAAG 1380 

CATAATTTCT CTAATATCGG TGTTATTTAT TTGTTGGCAA AAGTTAAGTC GGTATCTATA 1440 

TTGCCAGTAA AGTGAGTGAT ATTAAGGTCT TGACCATCTA ACCATGATTT GAAATCTATT 1500 

ATTTCTGGTG GOGCATTTTC TCCCAATGTA AAATATGCAG TTAATGTTTC AGGTTGATAC 1560 

ATTGATGTAT GGATGGTGCC AGACCAGCTT TTGAATAGTT TACTGTAAAT TTCATACTGA 1620 

GGATTATTGA ATAACTTAAA TGCTGTAGTC ATATCTAAAT TATCATTAGT TTGTGAAATG 1680

GTACGCXSCCA GTCTTTCTTT AGATTCTTTT GTATAATTAC GATTTTCATG TGTTAATATT 1740 

TCAAAATGAT TTGTACATAT ATTATCATAA CGAACATCTA TTGATCTCGG TGTCACTTCA 1800 

ACAATTGCAT GGTTCAATGA TTTGTCCATC AGTATGTAGC TAAATGAGCT TCTGTGTGGT 1860

ATTTCTTTCA ATAATTGGAT TGCTTCTGTT ACATTTCGGC AATTTTCAAG AATTAGACGA 1920 

CCAATCATAT AACATACAAA ACCATTTGCT GGTTTCTTCC GGTGCATAAA GTTATAGCCC 1980 

ATAGTTAATC CTGACTCATT CATACCATCC ATTCTTCCAG TTACCCTTGA TACAGGACCA 2040 

ATTTGAGCTA AACCGCTATC TGTAGGTTGA TAAAGTAAGT AGCGACCATC ATAAGTTGCA 2100 

GGGTGGTAAT CATAATTTCT AACCATGAAG TCTTTGCCTT GAAAGACCGT GCAaCCACTT 2160 

TCTTTTAAAT CGGTAAAACG ATAATGTCCA AAGTTTAAAA TAATTTGGCG TGTTGGCATT 2220 

TTGAGTATAC TTTGTAGTCC CATTAATTCT TCCCATATTT GAGGTGCGTA TGTTTGGAAT 2280 

ATTTGATAAG TTTCATTTAC ATCTATATCG AAACGTGGGA CaCnTTTTTT CCATTCTTTT 2340 

TCTCGATTTT TTAGAAGAGG TGTTTGTTGA AGCCATTTAC CAGTTTTAAC ACCTAACTCG 2400 

AAATGTGAAC CTCTAAAAGT CATGATATCT GATGTCACTT 6TTGCATATC ATCGGCCCCT 2460 

TTCTTTTTAG TTGTAATATA TTGT7UVATAA ATAGTAATCG TATGTATATT GAATGTCATG 2520

TTAAATAAAG TTATATTTTA CTAAATGAAA TATAAAATTG TTTGAGGTGA TTTCTCGGTG 2580 

TATAAGACTT ATCAATCAGT TAAAACATAT TTTTATAGAT GGTGGGGATA TTGAGTTAAA 2640 

AACTTAAAAT CATCTTATCA TAAATATCAA TCTTAAGTTA GCATTCACGA TAATAGTCAT 2700 

TGTTAACATT AGCATATAAG GTCATGTCAC GTTGAAACAG AGGTTCCTCG GCATTTTTGA 2760 
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TTATTTAATG ATTATTCTAT ATATGATAGT ATAATGAAAT C?rAGATAGGT ATTTAATTTA 2880 

ACAGAGGTGA AATTGAGATG TGGAATTTTA TTAAATGtGT GkTTAAATTC GTATTTAGCT 2940 

TAGTTGCTAT TACAACATTA GTTGCTGGTG TTGGTGTAGT AGCATTTGCT TATATCTTTA 3000

AAAAAGATTT TGAAGATATT GAAAGAAAAA CTAAAGAAAT TATTTCTGAT ATTGAAAGTA 3060 

AAAATAACTA ATAACATTTA GAGGCTGGGA CATAAATCCC TAAAAAACAG CAGTAAGATA 3120 

ATTTTCAATT AGAAAATATC TTACTGCTGT TCTCTATTTn ATcAmTACTt CGTATTGAAT 3180 
GGCTTCGCTT TCCTAGGGTG CCGTCTCAGC CTTGGTCTTC GACTGGCACT GCTCCCTCAG 3240 
GAGTCTCGCC ATTAATACTA CGTATTAACA TGTAATTTTA CTTTGGAAAT ACTTTTAAAA 3300 

AATAAGACAC TTTGGCCCAA CTTGGCACAT AAATGTAAAA TTCAAT 3346 
(2) INFORMATION FOR SEQ ID NO: 147:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2375 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:





GTTGAAGAAA GAAATATAAC AGTCAATTAT AATTATAACC TTGTTGAAAT CGACGGTGAC 60

AAAAAAGTGG CTACATTCGA ACATATCAAA GCATACGATA GAAAAACAAT AAGTTATGAT 120

ATGTTACATG TAACACCACC TATGGGTCCC TTAGATGTAG TAAAAGAAAG TACACTTTCA 180

GATAGTGAGG GTTGGGTAGA TGTTAACCCA ACCACATTAC AGCATAAAAG CTACTCTAAT 240

GTATTTGCAC TTGGTGATGC TTCAAATGTA CCTACTTCAA AAACAGGCGC ACTATTcGTA 300

AGCMGCACC TATCGTCGCT AATAATTTAT TGCAAGTGAT GAATAATCAA ATGTTAACGC 360

ATCATTATGA TGGTTATACT TCATGCCCTA TTGTTACTGG ATATAATAGG TTAATACTTG 420

CAGAGTTTGA TTATAATAAA AATACTAAAG AAACAATGCC GTTTAATCAG GCCAAAGAAC 480

GTaGAAGTAT GTATATATTT AAGAAAGATT TATTACCTAA AATGTATTGG TACGGCATGC 540

TAAAAGQATT AATATAATAA AGTACAGAAA ACAATAAATT TTTAATGAAA AATCTTTTAC 600

TATAAAAGAT TAAGTATTTA AATGACGTGT CAGTGTTGTG TTTATATGTC GTGAATTTTT 660

AGCTCTAAAT AGTATAAGAT TGAAAAAGTT 6TTACTGTTT TAAATGATCA CGATGAAGTC 720

ATTCAATAAG AATGATTATG AAAATAGAAA CAGCAGTAAG ATATTTTCTA ATTGAAAATC 780

ATCTCACTGC TGTTTTTTAA AGGTTTATAC CTCATCCTCT AAATTATTTA AAAATAATTA 840
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AGATATTCAA ACCACGTGTA CTCAAAATGA TAGCTTGGTA TGTACCTCCA ATAGTAATTT 960 

CAATAACTTT GTCTGTTGAA CACTAAGAGC AATTTTAATT TCATAATGTG TTGTAAACAT 1020 

TTTTTTTGAT TGGAGTTTTT TTCTGAGTTA AACGATATCC TGATGTATTT TTAATTTTGC 1080

ACCATTTCCA AAAGGATAAG TGACATAAGT AAAAAGGCAT CATCGGGAGT TATCCTATCA 1140 

GGAAAACCAA GATAATACCT AAGTAGAAAG TGTTCAATCC GTGTTAAATT GGGAAATATC 1200 

ATCCATAAAC TTTATTACTC ATACTATAAT TCAATTTTAA CGTCTTCGTC CATTTGGGCT 1260 

TCAAATTCAT CGAGTAGTGC TCGTGCTTCT GCAATTGATT GTGTGTTCAT CAATTGATGT 1320 

CX5AAGTTCGC TAGCGCCTCT TATGCCACGC ACATAGATTT TAAAGAATCT ACGCAArCTC 1380 

TTGAATTGTC GTATTTCATC TTTyTCATAT TTGTTAAACA ATGATArATG CAATCTCAAy 1440 

ArATCTAATA GTTCyTTGCT TGTGTGTTCG CGTGGTTCTT TTTCAAAAGT GAATGGATTG 1500

TGQAAAATGC CTCTACCAAT CATGATGCCA TCAATACCAT ATTTTTCTGC AAGTTCAAGT 1560

CCTGTTTTTC TATCGGGAAT ATCATCGTTA ATTGTTAACA ATGTGTTTGG TGCAATTTCG 1620 

TCACGTAAAT TTTTAATAGC TTCGATTAAT TCCCAATGTG CATCTACTTT ACTCATGCGT 1680 

TTGATAAAAA CTTAAATAAT ATTAATTCGG TCATCAGTGG CGTTAAATCT TTTATCATTT 1740 

TTAGTTATAG TTGATAAATT TATATTTATA AGCATATATG GATATTTCAT CAAAAATTTT 1800 

TATTTATATA AATCCGAACT GCATACATAT TTGTTTAAAT AAGAGGTATT ATTTTTCGGG 1860 

AAATTGCTGT CTGAGTTAAA AGGATTAGTT TTATAAAATG AGTTGAACTA TAGCCAAAAA 1920 

CGATTAAAAT ACTGATAATC CATTTTTGtA TTATGTTAGG GACTTTTTTA CTTAATTTTA 1980 

ACCCTATTGG aGCmAATATA ATACTCCCTA TTATAAGGAA TAAGOCGTCA TATAAaGGGA 2040 

TATAACCTTG AATAAGTTTG ATGACAAAAG CACCAATTGA AGATATAAAA GCAATTACTA 2100 

TAClfATTAGC GACTACAGTA TTCATTGGTA ATTTGAATAA AACCAATAAT ATAGGAATAA 2160 

TAATGAAGGC ACCACCTGCA CCTACTATAC CTGAAATAAT ACCAATGAAA AGGCCAATGA 2220

TAACTAATAA ATATTTATTA AATGAAGACT TTTCGGAACT AGGTTtCACT TTAATAAACA 2280 

TTAATGTTAA TGCAAGTAAA 6CAATAATGA TATATACCGT ATTTACAAAT GTAGCATCAA 2340 

ATAAATTTGC TAGAAATGCA CCTAACATAC TCCCT 2375

(2) INFORMATION FOR SEQ ID NO: 148: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 6115 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 

GAGGTTTCTA GACAAGCTTT TAATAACTTA CCAAACTCAT TAAgrTGGTT gTGtTGGACT 60 

GCCtATTATC mAAGtATTAT GaGTTGTTTA ATATTAGtGC TAArACATAC GAAGAGTGGT 120

TTAAACAATT TAGTAGTAAG AAAGCACAAT TCAGTATTAA TCTCACGGAT AAATGGATAA 180 

TTCAAATCGC ATATGGTAAA TTAATAATAA TGGCTAAAAA TAATGGCGAT ACATATTTTA 240 

GAGTTCAAAC AATTAAAAAG CCAGGTAATT ATATTTTTAA CAAATATCGA TTAGAGATAC 300 

ATTCTAATTT ACCAAAATGT TTATTTCCGC TTACAGTGAG AACACGACAA AGTGGCGATA 360 

CATTTAAACT GAATGGGCGC GATGGTTATA AGAAAGTGAA TCGCCTGTTT ATAGATTGTA 420 

AACTGCCACA GTGGGTTCGG GATCAAATGC CAATC6TATT GGATAAACAA CAGCGCATTA 480 

TTGCGGTAGG AGATTTATAT CAACAACAAA CAATAAAAAA ATGGATTATA ATTAGTAAAA 540 

ATGGAGATGA ATAGC6TTAT GCATAATGAT TTGAAAGAAG TATTGTTAAC TGAAGAAGAT 600

ATTCAAAATA TCTGTAAGGA ATTGGGAGCA CAATTAACAA AGGATTATCA AGGTAAACCA 660 

TTAGTATGCG TGGGTATCTT AAAAGGCTCA GCAATGTTTA TGTCAGATTT AATTAAACGA 720 

ATTGATACCC ATTTATCAAT TGATITCATG GATGTTTCTA GTTATCACGG AGGCACTGAG 780

TCAACTGGTG AAGTTCAAAT CATTAAAGAT TTAGGTTCTT CTATTGAAAA TAAAGACGTA 840 

TTAATTATTG AAGATATCTT AGAGACTGGT ACTACACTTA AGTCAATTAC TGAATTATTA 900 

CAATCTAGAA AAGTTAATTC ATTAGAAATA GTTACTTTAT TAGATAAACC AAACCGTCGT 960 

AAAGCGGACA TTGAAGCTAA GTATGTAGGT AAAAAAATAC CAGATGaATT TGTTGTTGGt 1020 

TACGGTTTAG ATTATCGTGA ATTATACCGA AACTTACCAT ATATCGGTAC GTTAAAACCT 1080 

GAAGTGTATT CAAATTAATT TTTTAATCAA TTTCAGTTAT TATTACTATG CGTTTGAGAA 1140 

ATAATAGTGT AGACTCAAAA ATATGAAAAA TGTATTTCAT ATATATTTAA TTTTAGACAA 1200 

GACATATGTC TT6AAAAGTT GAAAAATATA GAGATTGATA AAACTAATAC GGGTGTGAAT 1260 

GACATTGATG TTAAGCTCAA TTACTAGCTT ATAAAACATG TCATATGTTA CAATTTTTGT 1320 

TAGTTTTATT ATGGGAAGTA GGAGGAAATG ACGCATGCAG AAAGCTTTTC GCAATGTGCT 1380 

AGTTATCGTA ATAATAGGCG TTATTATTTT TGGTCTATTT TCATATTTAA ACGGTAATGG 1440

AAATATGCCG AAACAGCTTA CATATAATCA ATTTACTGAG AAGTTGGAAA AAGGTGACCT 1500 

TAAAACTTTA GAAATCCAAC CACAACAAAA TGTCTATATG GTAAGTGGTA AAACGAAAAA 1560 

TGATGAAGAC TATTCATCAA CTATTTTATA TAACAACGAA AAAGAATTAC AAAAAATTAC 1620 

TGATGCTGCT AAAAAGCAAA ACGGTGTAAA ATTAACGATT AAAGAAGAAG AAAAACAAAG 1680 
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TTTCTTCCTA AGCCAAGCAC AAGGTGGOGG TAGTGGCGGT CGTATGATGA ACTTTGGTAA 


1800 




ATCTAAAGCA AAAATGTACG ATAATAATAA ACGTCGTGTT CGTTTCTCTG ATGTAGCAGG 1860

GGCAGATGAA GAAAAACAAG AATTAATTGA AATTGTTGAT TTCTTGAAAG ATAATAAAAA 1920

ATTCAAAGAA ATGGGATCTA GGATTCCTAA AGGTGTCTTA CTTGTTGGAC CTCCAGGTAC 1980

TGGTAAAACA TTACTTGCTA GAGCGGTTGC AGGTGAAGCT GGCGCACCAT TCTTCTCTAT 2040

TAGTGGTTCA GACTTTGTAG AGATGTTTGT TGGTGTTGGT GCGAGCCX3TG TTCGTGACIT 2100

ATTCGATAAT GCTAAGAAAA ACGCGCCTTG TATCATCTTT ATCGATGAGA TTGATGCTGT 2160

TGGTCGTCAA CGTGGTGCAG GTGTTGGTGG CGGTCATGAT GAACGTGAAC AAACCCTAAA 2220

CCAATTATTA GTTGAAATGG ATGGTTTCGG TGAAAATGAA GGTATCATTA TGATAGCTGC 2280

TACAAACCGT CCTGATATCC TTGACCCAGC CTTATTACGT CCAGGTCXSTT TTGATAGACA 2340

AATTCAAGTT GGTCGTCCAG ATGTGAAAGG CCGTGAAGCA ATTCTTCATO TTCATGCTAA 2400

AAACAAACCA CTTGATGAAA CGGTTGATTT AAAAGCAATT TCACAACGTA CACCTGGTTT 2460

CTCAGGTGCr GATTTAGAGA ACTTATTAAA TGAAGCATCT TTAATTGCTG TACGTGAAGG 2520

TAAAAAGAAA ATTGACATGA GAGATATCGA AGAGGCAACG GATAGAGTTA TAGCCX3GACC 2580

TGCTAAGAAA TCTCGAGTTA TTTCTAAGAA AGAACGTAAT ATTGTTGCTC ATCACGAAGC 2640

TGGTCATACA ATTATCGGTA TGGTACTTGA TGAGGCAGAA GTAGTGCATA AAGTTACTAT 2700

TGTTCCACGT GGACAAGCAG GTGGTTATGC AATGATGCTA CCTAAACAAG ATCGTTTCTT 2760

AATGACTGAA CAAGAGTTAT TAGATAAAAT CTGTGGTTTA CTTGGTGGAC GTGTATCAGA 2820

AGATATTAAC TTTAACGAAG TATCAACAGG TGCTTCAAAT GACTTCGAAC GTGCAACACA 2880

AATCGCACGC TCAATGGTTA CGCAATATGG TATGAGTAAA AAATTAGGAC CATTACAGTT 2940

CGGTCATAGC AATGGTCAAG TATTCTT/WjG TAAAGATATG CAAGGTGAGC CTAATTATTC 3000

AAGCCAAATC GCATATGAAA TTGATAAAGA AGTTCAACGA ATCGTTAAAG AACAATACGA 3060

ACGTTGTAAA CAAATTTTAT TAGAGCACAA AGAACAATTA ATTTTAATTG CTGAAACATT 3120

ATTAACAGAA GAAACATTAG TTGCTGAACA AATTCAATCA TTATTCTACG AAGGTAAATT 3180

ACCTGAAATT GATTATGATG CAGCTAAAGT TGTTAAAGAT GAAGATTCTG AATTTAATGA 3240

TGGTAAATTC GGTAAATCTT ATGAAGAGAT TCGTAAAGAG CAATTAGAAG ATGGACAACG 3300

TGACGAAAGT GAAGATCGTA AAGAAGAAAA AGATATTGCT GAGGATAAAA AAGAAGCTGA 3360

TAAATCTGAT GAAAAAGATG AACCAGCACA TCGACAAGCC CCAAATATCG AAAAACCTTA 3420

CGATCCAAAT CACCCAGACA ATAAATAATC GATTATATTC AGTACCTCTT TCTATGATAA 3480
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AATTGTTATA GCAGAAAATA ATTGTAAAAC AAGTTACTTC ATTATTTAGA ATGATGGGTG 3600 

TAGAATAAGT ACAATTGrrG CATTTTATGA AGTAAAGTAA TTTTTTAAAT ATAGAGTAAT 3660 

AGAGGAGATT GAAATAATGA CACACGATTA TATTGTTAAA GCATTAGCAT TTGATGGAGA 3720

GATTAGGGCT TATGCTGCTT TGACAACTGA AACTGTTCAA GAAGCACAAA CGAGACATTA 3780 

TACATGGCCG ACAGCATCTG CTGCAATGGG AAGAACAATG caCAGCAACA GCTATGATGG 3840 

GCGCAATGTT GAAAGGTGAT CAAAAATTAA CTGTCACTGT AGATGGCCAA GGACCTATTG 3900

GACGAATTAT TGCCGATGCA AATGCTAAAG GCGAGGTGCG TGCTTATGTA GACCATCCAC 3960 

AAACTCATTT TCCATTAAAT GAGCAAGGTA AACTTGATGT AAGACGAGCG GTAGGGACAA 4020 

ATGGATCTAT TATGGTTGTT AAAGACGTTG GAATGAAAGA CTATTTCtCT GGAGCAAGTC 4080 

CaATTGTTTC AOGAGAACTT GGTGAAGATT TTACTTATTA TTATGCTACA AGTGAACAAA 4140 

CACCTTCATC GGTAGGTCTT GGTGTATTGG TAAATCCTGA TAATACGATT AAAGCAGCAO 4200

GAGGATTTAT CATTCAAGTT ATGCCAGGTG CCAAAGATGA AACAATTTCA AAATTAGAAA 4260 

AAGCAATTAG TGAAATGACA CCAGTTTCTA AATTAATTGA ACAAGGATTA ACGCCAGAAG 4320 

GATTACTAAA CGAAATCTTA GGTGAAGACC ATGTGCAAAT TTTAGAGAAA ATGCCTGTTC 4380

AATTTGAATG TAATTGTAGT CATGAGAAAT TTTTAAATGC TATTAAAGGA TTGGGCGAGG 4440 

CTGAGATTCA AAATATGATT AAAGAAGATC ATGGTGCTGA AGCAGTATGT CATTTCTGTG 4500 

GAAATAAATA TAAATATACT GAAGAAGAAT TAAACGTGTT GCTAGAAAGT TTAGCGTAAT 4560 

TTAATTTAAA TCAATACGCT AAAATGTTTA TTTTTAGCGG TTTAGTGAAA TGTAGAACTA 4620 

AATAGTTGTA TAATCCTTAG TGATTTTGTT TGCTTTCTAG AATTTATTTG ATAAAATAAT 4680 

TCTATATCCG ATAAATAAAC TAAGATTTCA ACAACTAACT AAAAAGGAGT GTTCTTAATG 4740 

GCAmAAAAC CAGTAGATAA TATTACTCAA ATTATTGGCG GTACACCGGT AGTCAAATTG 4800 

AGAAATGTAG TAGATGACAA TGCAGCAGAT GTTTATGTAA AATrCGAATA TCAAAATCCA 4860 

GGTGGTTCTG TAAAGGATAG AATTGCTTTA GCAATGATTG AAAAAGCAGA GCGAGAAGGC 4920 

AAAATTAAAC CTGGCGATAC AATTGTAGAA CCAACAAGTG GTAATACAGG TATCGGTTTA 4980 

GCATTTGTAT GTGCTGCTAA AGGATATAAA GCAGTATTTA CTATGCCCGA AACAATGAGC 5040

CAAGAGCGTC GTAATTTATT AAAAGCATAC GGTGCGGAAT TAGTTTTAAC GCCTGGATCA 5100 

GAAGCGATGA AAGGTGCAAT TAAAAAAGCT AAAGAATTGA AAGAAGAACA TGGTTACTTC 5160 

GAGCCACAAC AATTTGAAAA CCCTGCX3AAC CCTGAAGTTC ATGAGTTAAC TACAGGTCCT 5220 

GAGTTATTAC AACAATTTGA AGGGAAAACT ATCGATGCGT TCCTAGCTGG TGTTGGTACT 5280 




EP 0 786 519 A2



GTTGCTATAG AGCCTGAGGC TTCTCCAGTA TTGAGCX3GTG GTGAGCCAGG TCCACATAAA 5400

TTACAAGGTT TAGGTGCTGG ATTTATTCCA GGCACTTTGA ATACAGAAAT CTATGACAGT 5460

ATTATTAAAG TAGGAAATGA TACAGCGATG GAAATGTCTC GTCGAGTTGC TAAAGAGGAA 5520

GGTATTTTAG CAGGTATTTC ATCAGGTGCT GCGATTTATG CTGCCATTCA AAAAGCAAAA 5580

GAATTAGGAA AAGGTAAAAC AGTAGTAACA GTATTGCCGA GTAATGGTGA ACGCTACTTA 5640

TCAACACCTT TATATTCATT CGATGACTAA TTAATGTCAT TTAAAAGAGT GAGTTATCTT 5700

TTTGAGATAA CTTGCTCTTT TTTTCTACCA TGTATATTTT TAAAAATATG AGCGTTAAAT 5760

TATU^CATTTT TCTGATAAAA ATATCCAGTG AATGATAAGA TAATAAACGT ACATACTAAT 5820

AACTAGTAAA TAGCAGGAGT AAATTTTATT AGAGTTAAAC AATACATAAT TAAAGGGTGG 5880

TTAACATGAC TAAAACAAAA ATTATGGGcA TATTAAACGT CACACCTGAT TcATTCTcAG 5940

ATGGTGGAAA ATTTAATAAT GTTGAATCAG CTATAAATAG aGTGAAAGCC ATGATAGATG 6000

AAGGTGCTGA CATTATAGAT GTTGGAGGTG TTTCAACGAG ACCCGGTCAT GAAATGGTTT 6060

CATTAGAAGA TGAGATGAAC AGAGTATTAC CTGTTGTTGA AGCTATTGTC GGTTT 6115

(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10401 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:

TAGATACTGG GziTAAAcaTc AAAAATAtyT GCtTaTTCaC GTGTTTAcGc TCCCtCAAAC 60

GCAACGTTAA TTGCGTGTAA TCATTTAGTG TGAATTcAGA CGCTTCTTCC ATGACTATGT 120

CTGATATGCC TTTTATCGAC TTTATTTTCT CTGGGTTATC TAATCCTTTA AACAAAAAAA 180

CTGCGCCGTT TGGCAATTCA ACTTTGTTAT CAGTCTTATT CCAAAGGCAC ATGTCCCAAA 240

TACCAAAGTT TATCAAACAA TCTTTAACAT CTTCGAACAA ACTATCTTTA ATTGTTGATT 300

GTACTTTTCT AAGCCACAGT ATACGCCTAG GATATTTCCA ATCTTGCAAT GCTTTGAGTA 360

CAACTTTTTG TATAACGCCG TGAGACTTAC CGCTCGAACC TCCACCGTAA TGkACTTCAG 420

TGAAGTtATC GTAATTGGTT AGTATTTCGA ATATGTTTCT ATTGAAAACA TTAGACGGTT 480

TGTTAAAGTT TAATTTAACT TTCGTCATCG TACTCACCAA TATTAATCTC AATATTCTTC 540

TGAGTAATTT CTTTTTTATC GATATACGCA CCATGTACTT TTAGTATGTG GTCAATAGAT 600
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TTTAAATGGT CATATTTCTT ACTGTAAGCC TCTTGAGGTT CTCCTCTAGC AATAGAAGCA 720 

GATAACGCTA AAGCTTCTGT AATACTCATT AAACGCTCTT CTTGTATCTG TTCTAATCGT 780 

TCTTTAATAT ATTCCGAAAC ATTAACATTT CTTAACAATC GACTTGCTAA AGACTCTGCT 840

GTTTTCTTAC TATAACCTGC TGTAATTGCT GCTTTTTTAC CATTACATCC ATTCATTATA 900 

TATTCATCTG CGAATCTCTT TTGTTTTTCG TTCATTTCAT TTACCACCAA CTCTCGCGCT 960 

ATACGCTTTT TAAAATTAAA AAAGGATTGG CTATAATCAG CCAACCCACA TAGATCCTTT 1020 

AITCCTAATT GCGATAAGGG AAACGCAGTA CGATAGTCAA TATCCTACAC TATCATAATA 1080 

TCTCATTTAA GGTATCAAAA ACTGCCACTT TACTGCCAAT TTCAGTCTTC CCCTAACTCT 1140 

TCCGCCAATC TAGATATGAT TTTTCTTTTG ATTCTATQAG CAGTTCTATC AOAAATGTGT 1200 

ATGTCAACAC AAACTTTCAC TAATTCCTTT TTATTAAAAT AATACTCTTG AATGAATTCG 1260 

CGTTCTTTCC TGCTTGATGT GTTGATTATA CGTTCAATAG CGCTCTTAAA CTCAAQGATT 1320 

TTACCTCTTC GTATACTACA AAGATAATTA GTTACTGCCA TTTCTGTTTT CGATGTATTA 1380 

GACGGTACAA ACTCCCCGCC TATATTTGTA TCTGTTGGAA TCCACXK5TGT CATTATTTCA 1440 

CTTCTTAAAT CTTCAAGTTG TTTATGATAA TTAGGATAAT CACACAACTC ATCTTCTAAC 1500

TTTCGAACTG TTGATAATTT TAATCCGTAT TTCTTTTTAG TCATGAATAC CCTCCX3TACA 1560 

AATATGTTTA ATCTTCAAAG TGTCTCAATC TACTTCTTAA TATCTCTATC TCTCGCTCTT 1620 

TAACTTTTAC ATCACCTTTT AACTGTTCCG CTTGTAACAT CACACCAAAC AATAAGATGA 1680

CTAGTAATAT AATTGCTATG ATTAACCACA TCATCTACTC CGACACCTCC GCCCTCATCA 1740 

AATCAGACTG ATCACTCAAC TTTGCGAAGT CACTTGGCGC CTCTACATCA TCATTAGCCG 1800 

TCATCATAAT ATATACTTGC TCAGTTACAT ACTTACCTAA CTCATACATC GCTAGTAAGA 1860 

ATAASAGTCT CAAAATTTCT TTAACCACCA CTAAACACCC CATGTTAATT TATCGATAAT 1920 

TTGTATAGCT TGTTTTAATG CGTCTCTTTT TTCTTTGATA TCTCTATTAT CGCCATCTTC 1980 

ATCAGCTGAC ATTAACTCAC TGTCATATTC ATATAATAGT TCT6ATATTT CATTACTAGC 2040 

TACTACTAAT AAGTTTTCAT CTACATCAAT CGTTACCGTT TTCTTTGGCA TCTCCATCTC 2100 

TCCTTATCTT AACTTGTGCC TCX5TATTTGC GCTCAGCTTC TTCTTTACTC TCTGCCTCAA 2160

CAACTGTAAA CGTCTGATTA TCTCTAGCAG TAGTAAAATG TTCATGTGGT TGTCCTGTTG 2220 

AATCTTTGAA TGTTGTGACT AAGTATTGCG TCACTTCTTA TCACTCCTTT GAATGATTCT 2280 

AAGTTTTTCT ACGAATAAAA GTATTAGTAC AACACTCAAT GTAGCCAACA TATTTTTTTG 2340

CTTTGCAAAA TCTACTATAA CGATTAAGAC TAATAACATT CCAATTCTGC ATGTAAATAA 2400 

EP 0 786 519 A2 





TACAAGTATT GGAACTAATG TAATGATGTA ACTCACTTCC CCAAAACCTC CTTGACTCGA 2520

TCTAAGATGT CTTTACACTC CGCTACTTCC GAAGCCTTTT TCTCCACGTT CTGAAACACT 2580

TTCGAATTCC TCCACTTGCT TTAGTTCAGG TGTCCATATA GGCACX3ATAA CCAATTGAGC 2640

TAGTTTGTCT CCTTCGTTGA TTTGATAAGT TCCGTATTGT CTTATGGCGT CACTCAAATC 2700

GATTTCTCCT TTAATATCAA AAACACCTGG TGTGATATAA CCATTCGATG CAATAGCGTC 2760

ATTCTTGATA TTAATCCCTA AATTGCCGTG ATATCCCGCX3 TCTATCTTGC CTGTTTCAAT 2820

CACTAAATGC GTTTTACTAC TTACACCACT ACGGCTAGTT AATAGTCCGA CATAGCCCTC 2880

TGGTATGCTT ACAGCTACAT CTGTTTTAAT CACTGCCTTT TCTTGTGGCT CAAGTACGAC 2940

AGTTTCAGCT GAGAATATGT CATAACCTGC ATCCGTCTTA TGATTTCGTT CGGGCATTCT 3000

AGCATTTTCT GATAATAGCC TTACTTGTAA TGTGTTAGTC ATTTTCCTGC TCCTCCCTAG 3060

CTGTAGCAAA CGCTATTCTC AATTTCAATC TTTCAACAAT ATGAATTAGT GCGGTATTGA 3120

GGAATATTTC AAATTCTTCA ATGTTCTCAT CTATAAAATC AAGTATTTCT TCCTCTTGTT 3180

CACTGTCAAA CTCGCTTAGT ACATCCCAAA TATTTATGTC GCTTTTGCTC GTTTCTAATA 3240

CTCTITI'GAT TATTTCTGAA TTACTTTTAT TACTCATTTT CCTTGTTCCT CCTCATATTT 3300

ATAGACAACT TGACCTGCCA TAATCCCTAC TGCTTCATCA AGTTCAATAC CTTCTTTAAC 3360

TGAATGTTGA ATAGCATTTG TCATTCCCTC AAGTATTTCA TCAAACGCTT GTGCTCTCTT 3420

ATACACGTCC TCAATCTCTT TTAGTAATCC CTCTGTGTCA TTACCGTTAT ACGCACTAGC 3480

ACT6ATCACT GATTGTTCAA TTTGTTCGCG GTTATTCATC ATTTCCATCT CCTCTAAAAT 3540

AAAGTTAGTT GCTTCTGCTC CTCGTATTCC AAACCATGTT GCTTTATATA TGTTTCGAGC 3600

TCTTCCGCTG TATCAAATGT CTTTTTCACG CCTTGCCAAC CTGGCACGAT ATGCCCATGa 3660

AAGT&ATAAG TGCCGTTCAC TACATGGATA TGTGCCACTC GTTCGTTATC CTGATACAGA 3720

TATCTCTTAG ATCCGAAAAA TTGGTTTAAG TATTCTTTAC ATGCGCTATC GGTTTTAGGC 3780

ATTTATGCTT CCTGCCATTT CTTAAACATT TGGTTATAAG TA6TATCAAA CCAGTACGGA 3840

TCACGTGAAT GTTTTTGAGG CACATTAAAC AAATGTGGCT TCTTCTTACG TAGTTCAGCC 3900

TCTTTACGTC GTTGCCTAGC CATTTCACGC TCTTTGCTCT CTCGCTCCAT GATTTTGGAT 3960

AACACAATTT CTTTATACTC AGCTAAGCGC ATACCATAAG GTGCATGTAA GGCTTCTAAC 4020

AACGCCCAGC CACCTCGTAC TCTTTTTGCA ACCATTCCTG GAGTTAAACC GTTCTTTTTT 4080

ATCAATTCAT TTTCATGTTC GGTAAATTTA TATGGTTTAC CGTTAATCTT TACGATACTC 4140

ATTTATTCCA CCTCTATACA TTTACTTTTT TTAATCCAAT CCTCTAATTT GTGCGTGTTG 4200



AGACTAAAGA AAGATGTTTT GTATCCATTT TGTGCTATGT TCAGCATCAT GTTTAATGCA 6120 

AAACCTGTCT TACCCACTGA GGGACGCGCT GCX5ATGACGA TTAATTGTGA TGGTTCTAAT 6180 

CCCCCTATTT TGTAATCCAT TAGCTTGTAA CCCGTCTTAA TTTGCTTCTT AGGGCTATCG 6240

CTGTATAACT CTTCGACAAA CTCCTCAACA AACTTCTTGG TTCCATCTTC TTTTTTGTTA 6300 

GTAATTGTTT TTAAATCCTT GAGTTCATCA ATCAAGTTGT TAAAGTTTTG GTTCGTAGGT 6360 

TGTTGTTTGA ACTCAGTTAC CAATTCGTTA GCTTTGTTGA GCTGATAACT TTCCAATAAT 6420 

TCTTGTTGAT AACGTTCAAA GAAGCCATAT CCAATGAAAT CGGAGTTGTA AAGTTTAGTT 6480 

ATAGTATCTG CATCTAAAAA TTCTTTATCT TTAGTTGCTT TTAAATAGAT TTCTTGATGA 6540 

TCTATCTTTC CGACGTCCAT TACATAATTG AAAAAGGTTT TAAACTTTTC GTTCGTAAAC 6600 

ATGTAATCTT TAACTCTTAT CTTTTCTAAT ACGTCCGGTT GTTTAAGTAG CGTAGCGATT 6660 

ATTGTACTTT CAATTTCGAA TTGTCCGTAA TTCATTCXSTT TTCGCCCCCA AATTCTGCCA 6720

ACTTATTCAT GAACTTATCT AGCGCTATTT TTCTTTGTCT GACATATTCG GGGTCATTCT 6780 

GCATTTTCCA TTGGTGT6TA GCGGTTTCGT TATCTACTGG CTCGATAGAT ACTTTTTTAG 6840 

6TTCCTTACG CATGATTGCT GGTAAGTTAG GCGGGTACXK3 GTTGTTACTG TTGATATAAA 6900

CATCTACCGC TTTTACAGTT GGTTGATAAT CTCCATTTTO ACTTAATACA TCAATCCACA 6960 

TTTCTAACTT CX3GTTTATCA AAATCAATGT TGTATACGTA CCTAACTTTT TTAATAATTT 7020 

CTAATGCTTG TGTTTTGCTC ATCGGCATTA GTCATCACTC AATTCTTTTT CCATTTGTGC 7080 

AATGACATCA TCAGTAGTAT TITTTCTAGG TGCTATTTTA TTTTCTGCAT CTTCTTTTGT 7140

TTTGACATTC TCTTTAGCCC AGTTGTTTAA AACTTTAATT AAATAGCCAC CATGCGCACT 7200 

TTTGCTTTTA GTGTACTCAA CACCTACTTT TACAACTTCA AAAGCGTTTG TACCTATATC 7260 

ATCAATAGCA AACCCTAATT GTTCCATTTG ATTAGGTGTT AACTTATCAT CCAAATTTGC 7320 

AATTATATAT TTTATTGAAG ATGAGAAGAC GGCTTCTCTT TCTTCTTCTT TATTCTTATA 7380 

TTCTTCTTCT TTTTCTTCrr CTCTTTCTTC TTCTTCTTCT GTATCGTTAC QTAACOTTAC 7440 

GGTAACGTTA CGTTTTGCTT CTAGTAACTT TTTCTGTTTC TCACGATAGC GTTGTTGTCX3 7500 

CAATTTATTT TTTTCTTTAT GCTTAGCTTT GCTATCTAAG CTTTGATGCT TCTCCCAGTT 7560

TGTCACTTTT ATGACACCAT TAACTTTTTC AATCATGCCC AATGTCTCAA AAGTTTGAAT 7620 

TGCTAACCTT ATTGAGTTAA TAGGTCTATT AAATTCATTT GCTAACATTT CTTCGTTGTA 7680 

CXX3CAAGTTT TCGGATAGCA TAATATAACC TTGTTCATTG TACTTTCCTG ATAAAGTTAG 7740

TAACTTAACC CAAATAGTTA TGATCGTATC TCTTTCGGGT AAAGCTTCQA TATATTTQAT 7800 

CTCCTTTCAG CATTTTGTTG AGCCTCTCAT CAACTTTTAT CCACGAGTCA TGCAAGTGAT 7920 

ATTTATCATC AAACGACTTA ACGCCAATTG CGTGCTGTTC ATTATGATGT TGTCTACACA 7980 

GTGCTAACAC ATGTTTGTCG TAGTGATTCA TTTTGTTTCT GTTCATGCCT CTGCCGACTG 8040 

CTTCATAATG TGCCAGGTCT GCGTGAGGCT TTCCGCATAT TACACAGTTG CGGTTGATTG 8100 

TAGCCCAATA TAATAACGCT TTATCTTCGC TTAACAACTT ACTCGTTTCT ACACTCATAG 8160 

GTATTTGATG ATGAAACATA AACGCTATAA TCAGTTCTAT TAACTCCCTT GCAACTTTCA 8220 

TAGAACAGTC GCGCAGACTG ATTTCTTCAT AACCTTTCAT AATTTCCAAT TCTGTTTGTA 8280 

ATAATTTTCT AGTTGATTCT ACTGGTTCGC CCCAGTGAAG TTCTATATCT CTACACATTG 8340 

CGAATATTTT TTTGCGTTGT TCTATAGATA GTTTTTTATT GTCCGGAACC TCTACTTCTG 8400 

CTTTTAGTGG ATATCCGTTT TCTAGTAAGT CAATGTGACT TTGTTCAAGT TCAACACCAG 8460 

TAGCAACGAC GGAATAAGTA CCGTCATTGT CTTTCTGGTA TCTTGTAATC TATTGCATTT 8520 

AAACCACGTC CTAGAACGGT AAATCATCAT CATTGATTTC TATTGGACCA TTAGCATTAG 8580 

CGAATGGGTT TGATTGTTGA CTCATTGGCG TCTGTTTCCC ATTTGCTTGC TGTTCTTTTT 8640

GTTTCATCTC ATCAGTTTTA GGTTCTGGTT TATTAACTAC TTCATCGTCT TTATTCCAAA 8700 

CTTTTACATA TGAGAGTCTT ACAAAATACT TGCCTTGTTC CTCGTTAAAT TTATTTTTAA 8760 

GTACAATAGT TCCGATTTTG TTAATTAATT GATCTGTGTC AAAAGTTAAA TCTGGTAAGT 8820 

TCAATTTAAT TCCTAATCTA CTAAGTAACT CGATATATTG TTTTTCTTGA TAATCTTGTT 8880 

GGAATGGTGG GACGAATTGG TTGTGTTTGT ATTGTTTACC TTCGTTGTTT TCAAAAACAA 8940

TCX3TGAAGTA TCTGTTTTCT CTGTCGTTAA ACTCGACATT TGCAACTTTT ACTGTAAATT 9000 

CTCCAGCTCC TAAAAAGTCC CCACCTTTCA TGAATGCCTC TTGATTAGTT TCTTGAATGT 9060 

ATTGTGTTCT ACCAGTGATT TTCATAATTT TTATACCXSTC CTTTTAATTA ATTTTTAATT 9120 

ACCATTTCTA ATT6CTTGTA CAACATCGTT AATACTTGGA TTAATGAAAC GTTTGTTGTT 9180 

AATTTTGATG TTGCTTGAGT GTCTTATCTT TGTCTCGAAT AAATTT6ATG GTTCAGCGTT 9240 

AAGTACATAT TGATAAGTTT TTTCGCCGTC TTGCTCATGT TCTTCTATTG TCATTCTTGC 9300 

TAACACX3TCA GATTGACTGA TGACTGCTTT TTTTATTTGG TCTTGTGCCT CTATCGTGAT 9360 

TGTTGGATTG ATAGTACTTC CCTCATCATC TTTGTCTTTG TTAATGCCCT CGTGTCCGCT 9420 

TATAGCAAGA TGAAATTGAT AATGTTCTTG TAATTTAGAA ATATAACGAT AAATACTTAC 9480 

AATGCX3TGTA GCACACTCGC CCCAATCATT AAATGTCGGT TTCTTTGATT TACCGTCCAT 9540 

GATGTCGTCC ATAGTGATAT CACGTAACTT TTGGATTGTT TCAATCACTA CAACATCAAT 9600

AAAATGCTTA TAATTCTTAA TCTGCACAAC TGCCCCATCT TCTGTTACCG TTGTTCCGTC 9720

CTCATTTATA TCTAGTACTA AGGCATTGTT ATCTTTTGTT AAAAACGTAG TTTTACCAGT 9780 

ACCGAACTTG CCGTATATCG CAAATTTATA AAACTTGTTT GCATTTTGTT TGCTGATGTC 9840 

TTTTACACCT AG1TGCX5TTA AAATATCGAC ATCTTGATTA GTTTTTTCAG TCATCTATTC 9900 

TCCCACCTTT ACCGTGTATG ACGTTGGTTT CTCCACAATG CTAGCACCCT CTAAAACTTC 9960 

GCCGTTTGCG TCAATCAATG TGCCGTTTTC AGTTACATTG AAATCTTTCT TAATGTCTGA 10020 

TTGGCTAAGT TTTTTAGTTA CTTTTACATA GTTGTCAAAA CCTCGTTGCT CAAGTTGTnT 10080 

AATGACTTCT TOCTCATTGC TAACTTGAAT GACTTTTGAA CCTTTTCTGG CT6TCACTTT 10140 

TCCGTAAGtG TATTCAACTT GAATTTGCTA TCTTGTTCTT TTTGTATTCT GTAATATTCA 10200 

ATTACAAGGC TTTGTAAATA TTCTTTQCCA CTCTGTAATT TTTCTACTTC TTTATCTTTC 10260 

CATTCGTTTA TGCGTTCAAT TTCTTTATTT GCTAAATCGT TGATTTCATT CTCTTTAGTT 10320

GTGATTGCAT CCAGTTTCTn AAAAACCCAG TTAGCACTGT CTAGATCAGT nACTTTGAAT 10380 

CGGTCGTCTT GTTCGAATGT n 10401 

(2) INFORMATION FOR SEQ ID NO: 150:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2989 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:

TTTCTCTCTA TTATTCTCGA TGCGTAGATA ATTGTTTAAA TTTAAGTTTA TAGTAATGTT 60 

GAGTHTATAA TTTCATATAT CTAAAAACAG GTGTTGTATA TATAATCATT CATCTAGTTA 120 

TACTTACTTT AAAAATAATA TAATTTCATG CGATGCAATT CATTGATGGA TGTTTTTAAT 180 

CTTAATCAAA TCCAaATAAA GCATATATTT TTAAATTCAC TTTCTTTCGA ATOGATTTTT 240 

ATCTCTTGnA TTAAACTTTT CCATTGTTTC ATTAAAGCTC TCTGTCATAT CTATTCCCAT 300 

TGAATTCGCT AAACATAACA ACACAAATAA ATTATCACCT AATTCTGCTT TAATCGTATT 360

TGCTTCCTCT GAATCTTTCT TCTTTTTTTC ACCATAGGTA TGATTTATTT CACGTGCAAG 420 

TTCGCCCACT TCTTCAGTCA ATCTAGCTAA GTTAGCTAAT GGTGAAAAAT ATCCTGTTTT 480 

AAATTGTCCA ATATATTCAT CAACTTCACG TTGCATTTCT ACCATTGATT TCATTTCTAC 540

GTTCTCCTTA TATTGCATTT CTAATATAGT ATATATCAAT TTGAAGTCTC ATGCATGTTT 600

AATTCAGTTT ATATAAATGT AATGCATTCC TAACTAAATT AAATCAATTG AAATTGGGAT 720

TATAACTTTA TGATACGTAC CACTACAATA AAATAATATA GTGAATAATC TACCATTAGA 780 

AAAATAAGCA CAAAAAAACT AGCAACCACA CAAAAATGTG ATTAGCTAGT TAATAAGTGT 840 

CTAATTTAAG TTAATTGTTA ATCTATAAGA TTAATCACTT GAACGCGCAA TCAAAATAAT 900 

ACGTACAAGC TCTGCTACAG CGACTGCAGT TGCTGCAACA TAAGTCATTG CTGCTGCAGA 960 

TAATACTTTA CGCGCATGCT TGTATTCTTT TTCATTTACA ATGTTCAATG CCGTAATTTG 1020 

TTTCATCGCT CTTGAACTCG CATCAAACTC AACTGGTAAC GTAACAATTG AGAATAATAC 1080 

CGCTAATGAC ATTAAACCAG CACCAATCCA TAAAGCAGTT GAACCaAATG CACTACCTAT 1140 

CGCTGTTAAG ATAATACCTA ACATGATGAT CATATAACTT AATGAACTCC CTAGGTTTGC 1200 

AACAGGTACT AATGCTGCTC TGAATCTTAA GAACCAATAT CCTTGGTGAT CTTGAATGGC 1260 

ATGACCAACT TCGTGGGCTG CAATTGCAGT TCCAGCAACT GATGGTCTGT CATAGTTTGC 1320

AGGAGATAGT GAAACAACTT TCTTTTTAG6 ATCGTAATGA TCTGTTAAGA ATCCTTCACC 1380 

TTTAACAACT TCGACATCAT AAATACCGTT TGCATGTAAA ATTTCTAATG CAACTTCACG 1440 

ACCCGTTTTA CCACTAGTTG ATCTAACTTG TGAATATTTC TCATAGTTAG ATTTAACTTT 1500

GTGTTGTGCC CATAAAGGAA GCACCATTAA TATTACGAAA TAAATTATCA TAGTAAAAAT 1560 

TGAAGACAAT AAACTCACTC TCCTTTATAA ATATTTTACT GTCATTTGCC GTTTTTATCA 1620 

AATCATTTAC ACTTTAATAA TTTGTTTAAT TCAATATAAA GCAAAAGTCC AAAAACACTT 1680 

AGACAACATG ATAATACACC AATTTGCCAC ACATGTGTAG TTATAAAATC ATAATATGGA 1740 

AATTGAAGGT GAAAATAGTC AATATAATCA TTCAAAAACA CCCAAATCAT yGCTACACTG 1800 

ATTCCAATCA TAGAACGTTT AAACCTAGGA TAGAAGTAAA TTGCCTGAAC AGCCATTATA 1860 

CTGTiGGAAA ACATTAATAC CAAACCATTT ACTGTAATAT CACCTTGTTC AATAATAAAT 1920 

AATATATTCA TTATAACTGC CCAAATCCCA TATTTGAATA ATGTTACAAA TGCCAGTGCA 1980 

TCGATAATAC TATTTT6TTT TTGAATTAAT ATCAATGAGA TAGAAATAAC TAAGTATAAT 2040 

ATTGCAGTTG GGCTATCTGG AACAAAAATC TTAAAATGCC AGGGCGTATG ACTTAATTGT 2100 

TCACCATACC ATATATAACC ATAAATCATC CCTAATATAT TACAAATGAG TAGCATCATT 2160

AACCAAGAAC GTTQATAAAG TGTATATTGC CAAAATGCTT TAATTGTCAT CTGCTAAGTC 2220 

CTCAAATTGA TTATGTTTAT TTACTAGCTT GAGTGTATTT AAAATTTGCG TTAGTTGATA 2280 

AAAACGTTGC TTTTCATTCA TCTGTAAACT TAAATCAATA TTGTGTAACA AGTAATCTAT 2340

TAATAACX3CA TGTTTATGCC GATCTATAGC CATACTATTT AAGTCATGAA GATAAGTTTG 2400

TGACACGTTT GCGAAGTGAA TTTGAATATC AAAAGCACAG TTATGATTAG CGATATAATC 2520

AAATATTTCA TTTGTATTCA TTAACTTTAT ATTACGCTTA 6TAAATTGAA TTGCAGAAGC 2580 

GTGACTTCCC ACTTCTGCAA TTTCTAATOT TTCATGATGA TTAATTTTTG TATCTACAAA 2640

ATGAATGTTT GCCAATTTCG CCTCATTCAC TTTTATATAG TTAAGCACCC AAACTGCAAT 2700 

ACGCGACTTA AATCGATATT GAAAAAGTAA ATATTCAATA AAACTTTCTT TAATTTGATT 2760 

GAGTGTCTCT GACATCAAAT ACCCCATTTT AAGATTGCAA TCTTGaTAAT TCGTCATGCC 2820 

AATTTTCGTT ACTTGGcTCT AGTTCCAACA ATTGATTTAA AATAGTAATT 6CTTGTTCCT 2880 

TTTGACCAAT TTCAATTAAA TAQAAATAAT AATCACTCAT AAAATCAATA TTTGTTTTCA 2940 

TCGTTGGATA TGCTAATTCA AAGAAATGTT GAGCTTCTTT ATCTCGCTC 2989 
(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1143 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:





CATCAACTCC TTAATTACAC TGTAAATGAT ATGCGTCTTT TTGACAACTA TATTTGTCAA 60

ATCTACACCA AAAAATATGA TTATCCACCT ATGTATGACA TTTTGAAACA AACACCTCAA 120

CGCCTACAAG TCATAATTGT TTACTTTCGT TACACCTTCC TGCATAATTA ACAGCATTCT 180

AATTTTAGTA TGATGCACGC ATTTTCACTA AATCAAACCA TTCAAAGGAG ACTATTATGG 240

CATTTACATT ATCTGCAATT CAACAAGCAC ATCAACAATT TACTGGTGTT GACTTTCCAA 300

AACmTTCAA AGCTTTTAAA GATATGGGGA TGACTTACAA TATCGTCAAC ATTCAAGATG 360

GCACTGCAAC ATACGTACAT CAATCAGAAG ATGATATCGT TACGTCATCT GTAAAAAGTA 420

ATCATCCTGT TGCTCAAAAA TCAAACAAAA CAATAGTTCA AGACGTCTTA ACTAGACATC 480

AACAAGGGCA AACAGATTTT GAAACATTTT GTGATGAAAT GGCTGAAGCT GGCATTTATA 540

AATGGCATAT CGATATTQnA GCGGGCACTT GTACTTATAT CGACTTGCAA GACCAAGCTG 600

TTATTTCAGA ATTAATCCCT CAATAAACTA TATTTATAGC AACATTTTAA TTATTTCATA 660

AAATTTTATT GATAATCATT ATCGTTCGGT ATAAAGTAAA TACTATATAC TACTTATGAG 720

TGAGGTTGAT TATCATGATA ACTAACACTT TTATTTTAGG CATCACAGGC CCAACAAGTC 780

TTGTCGTCAT TAGCATTATC GCTTTAATTA TTTTTGGTCC GAAAAAATTA CCACAATTTG 840

AGTCTCACGA TACACCCAGT AAGGAATCGA AACAACAGCG AOAGCAATAG CACTGACCAC 960

ACCTTACTGG TTCACTTTAG CGAACTACGC CATCGGTTAG TAAAAATTTT ATTGTCGTTC 1020 

GTCATTACGG TCATCGTCGT ATATGTyTCA TCATTTTGGT GGATGACACC ATTCATAACG 1080 

TATATyACCC GgCACATGTG TcCTTACATG CATTTcATTC ACAGAAATGA TACAAATAAC 1140 

GTG 1143 



(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7953 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: 



CAACGCCTGA ACGTAAACCA TATCGTTTCG CGATTTCCTC ATCTTGACTA TTTACTAAAA 60

ACTCTCTCAT GGCGATTAAT GTTTCTTTTT CTTCTTTAGT TAATGGTAAT TCTAACTCAG 120

CTGCrmTG ACGCAAAGTT GGATGACCAT CTCTAATGAT GTCTTTCATT GTTAACATAT 180

ATTGCACCTT CCTTATTTTA ATTTGTTTTA GTTGAATGAC AGTAAAAAGG TTGTTAAGAT 240

ACTCATACAT TTTTATGTGT AAATATCTAC AAAGTTAACC AACTACTGCC AATGTTTATT 300

TTAGATAGTA TATGTAAATT TTCAaGAtAT GCgTAATTGC gTTAAAAAAT GaTTAAAGTG 360

TTGGTTTCAA GCAATGaTAC TTTAGAAATT TATTTATCAT CTTGACTTTA AAAATTATAT 420

TATAAATGAC GTAACTGTCA ACAGATATAC TTAGTArTGA AGATGTGTAA TGTAATTGTT 480

TAAAATTGAT TTCCAAGCAG ATTTTATTTA TCATTTAATT TAAATAGCAA GTGGAGGTAC 540

AAGTAATGAA ATTTGGAAAA ACAATCGCAG TAGTATTAGC ATCTAGTGTC TTGCTTGCAG 600

GATGTACTAC GGATAAAAAA GAAATTAAGG CATATTTAAA 6CAAGTGGAT AAAATTAAAG 660

ATGATGAAGA ACCAATTAAA ACTGTTGGTA AGAAAATTGC TGAATTAGAT GAGAAAAAGA 720

AAAAATTAAC TGAAGATGTC AATAGTAAAG ATACAGCAGT TCGCGGTAAA GCAGTAAAGG 780

ATTTAATTAA AAATGCCGAT GATCGTCTAA AGGAATTTGA AAAAGAAGAA GACGCAATTA 840

AGAAGTCTGA ACAAGACTTT AAGAAAGCAA AAAGTCACGT TGATAACATT GATAATGATG 900

TTAAACGTAA AGAAGTAAAA CAATTAGATG ATGTATTAAA AGAAAAATAT AAGTTACACA 960

GTGATTACGC GAAAGCATaT AAAAAGGCTG TAAACTCAGA GAAAACATTA TTTAAATATT 1020

TAAATCAAAA TGACGCGACA CAACAAGGTG TTAACGAAAA ATCAwAAGCA ATAGAACAGA 1080

AAGAAAAGCA AGACGTTGAT CAATTTAAAT AATTAATATA ATACAGATGG TAGGAAACAA 1200

CTAATACAGT TCCTATTATC TGTATCTTTT TTTATTAAAA CAGAACTTTT TCAAATGGTT 1260 

TAACAGTCCC ATTTATTTGT GGTACAATTA GTAAGGATAA AATGAATTTC TATACAATTA 1320 

TGGGAAAGGT ATTGTGAATT GAATGGCTCC TAAGTTACAA GCCCAATTCG ATGCAGTAAA 1380

AGTTTTAAAT GATACTCAAT CGAAATTTGA AATGGTTCAA ATTTTGGATG AGAATGGTAA 1440 

CGTCGTAAAT GAAGACTTAG TACCTGATCT TACGGATGAA CAATTAGTGG AATTAATGGA 1500 

AAGAATGGTA TGGACTCGTA TCCTTGATCA ACGTTCTATC TCATTAAACA GACAAGGACG 1560 

TTTAGGTTTC TATGCACCAA CTGCTGGTCA AGAAGCATCA CAATTAGCGT CACAATACGC 1620 

TTTAGAAAAA GAAGATTACA TTTTACCX3GG ATACAGAGAT GTTCCTCAAA TTATTTGGCA 1680 

TGGTTTACCA TTAACTGAAG CTTTCTTATT CTCAAGAGGT CACTTCAAAG 6AAATCAATT 1740 

CCCTGAAGGC GTTAATGCAT TAAGCCCACA AATTATTATC GGTGCACAAT ACATTCAAGC 1800

TGCTGGTGTT GCATTTGCAC TTAAAAAACG TGGTAAAAAT GCAGTTGCAA TCACTTACAC 1860 

TGGTGACGGT GGTTCTTCAC AAGGTGATTT CTACGAaGGT ATTAACTTTG CAGCAGCTTA 1920 

TAAAGCACCT GCAATTTTCG TTATTCAAAA CAATAACTAT GCAATTTCAA CACCAAGAAG 1980 

CAAGCAAACT GCTGCTGAAA CATTAGCTCA AAAAGCAATT GCTGTAGGTA TTCCTGGTAT 2040 

CCAAGTTGAT GGTATGGATG CGTTAgcTGT nATATCAAGC AACTAAAGAA GCACGTGACC 2100 

GCGCAgTTGC AGGTGAAGGT CCAACATTAA TTGAAACTAT GACATATCGT TATGGTCCTC 2160 

ATACAATGGC TGGTGACGAT CCAACTCGTT ACAGAACTTC AGACGAAGAT GCTGAATGGG 2220 

AGAAAAAAGA CCCATTAGTA CGTTTCCGTA AATTCCTTGA AAACAAAGGT TTATGGAATG 2280 

AAGACAAAGA AAATGAAGTT ATTGAACGTG CAAAAGCTGA TATTAAAGCA GCAATTAAAG 2340 

AGGCTGATAA CACTGAAAAA CAAACTGTTA CTTCTCTAAT GGAAATTATG TATGAAGATA 2400 

TGCCTCAAAA CTTAGCAGAA CAATATGAAA TTTACAAAGA GAAGGAGTCG AAGTAAGCCA 2460 

TGGCACAAAT GACAATGGTT CAAGCGATTA ATGATGCGCT TAAAACTGAA CTTAAAAATG 2520 

ACCAAGATGT TTTAATTTTT GGTGAAQACG TTGGTGTTAA CGGCGGTGTT TTCCGTGTTA 2580 

CTGAAGGACT ACAAAAAGAA TTTGGTGAAG ATAGAGTATT CGATACACCT TTAGCTGAAT 2640

CAGGTATTGG TGGTTTAGCG ATGGGTCTTG CAGTTGAAGG ATTCCGTCCG GTTATGGAAG 2700 

TACAATTCTT AGGTTTCGTA TTCGAAGTAT TTGATGCX3AT TGCTGGACAA ATTGCACGTA 2760 

CTCGTTTCCG TTCAGGCGGT ACTAAAACTG CACCTGTAAC AATTCGTAGC CCATTTGGTG 2820

GTGGCGTACA CACACCAGAA TTACACGCAG ATAACTTAGA AGGTATTTTA GCTCAATCTC 2880 



CTATTAGAAG TAATGACCCA GTCGTATACT TAGAGCATAT GAAATTGTAT CGTTCATTCC 3000

GTGAAGAAGT ACCTGAAGAA GAATATACAA TTGACATTGG TAAGGCTAAT GTGAAAAAAG 3060

AAGGTAATGA CATTTCAATC ATCACATACG GTGCAATGGT TCAAGAATCA ATGAAAGCTG 3120

CAGAAGAACT TGAAAAAGAT GGTTATTCTG TTGAAGTAAT TGACTTACX5T ACTGTTCAAC 3180

CAATCGATGT TGACACAATT GTAGCTTCAG TTGAAAAAAC TGGTCX3TGCA GTTGTAGTTC 3240

AAGAAGCACA ACGTCAAGCT GGTGTTGGTG CAGCAGTTGT AGCTGAATTA AGTGAACGTG 3300

CAATCCTTTC ATTAGAAGCA CCTATTGGAA GAGTTGCAGC AGCAGATACA ATTTATCCAT 3360

TCACTCAAGC TGAAAATGTT TGGTTACCAA ACAAAAATGA CATCATCGAA AAAGCAAAAG 3420

AAACTTTAGA ATTTTAATAC ATTTTAAAAG TTAACGAAGT TAGCGTATTT TAGTCTCATT 3480

GATTAAAATG AAATGTTTAA TTTACGAAAT CTTAGGAGGG CAAAAACGTG GCATTIGAAT 3540

TTA6ATTACC CGATATCGGG GAAGGTATCC ACGAAGGTGA AATTGTAAAA TGGTTTGTTA 3600

AA6CTGGAGA TACTATTGAA GAAGACGATG TTTTAGCTGA GGTACAAAAC GATAAATCAG 3660

TAGTAGAAAT CCCATCACCA GCATCTGGTA CTGTAGAAGA AGTTATGGTA GAAGAAGGTA 3720

CAGTAGCTGT AGTTGGTGAC GTTATTGTTA AAATCGATGC ACCTGATGCA GAAGATATGC 3780

AATTTAAAGG TCATGATGAT GATTCATCAT CTAAAGAAGA ACCTGCXIAAA GAGGAAGCGC 3840

CAgcAGaGCA AGCACCTGTA GCTACTCAAA GTGAAGAAGT AGATGAAAAC AGAACTGTTA 3900

AAGCAATGCC TTCAGTACGT AAATACGCAC GTGAAAAAGG TGTTAACATT AAAGCAGTTT 3960

CTGGATCTGG TAAAAATGGT CGTATTACAA AAGAAGATGT AGATGCATAC TTAAATGGTG 4020

GTGCACCAAC AGCTTCAAAT GAATCAGCTG CTTCAGCTAC AAGTGAAGAA GTTGCTGAAA 4080

CTCCTGCAGC ACCTGCAGCA GTAACATTAG AAGGCGACTT CCCAGAAACA ACTGAAAAAA 4140

TCCGTGCTAT GCGTAGAGCA ATTGCGAAAG CAATGGTTAA CTCTAAGCAT ACTGCACCTC 4200

ATGTAACATT AATGGATGAA ATTGATGTTC AAGCATTATG GGATCACCGT AAGAAATTTA 4260

AAGAAATCGC AGCTGAACAA GGTACTAAGT TAACATTCTT ACCTTATGTT GTTAAAGCAC 4320

TTGTTTCTGC ATTGAAAAAA TACCCAGCAC TTAACACTTC ATTCAATGAA GAAGCTGGTG 4380

AAATCGTTCA TAAACATTAC TGGAATATCG GTATTGCAGC AGACACTGAT AGAGGATTAT 4440

TAGTACCTGT TGTTAAACAT GCTGATOGTA AGTCTATTTT CCAAATTTCA GATGAAATTA 4500

ATGAATTAGC TGTTAAAGCA CGTGATGGTA AATTAACAGC CGATGAAATG AAAGGTGCTA 4560

CATGCACAAT CAGTAATATC GGTTCAGCTG GTGGACAATG GTTCACTCCA GTTATCAATC 4620

ACCCAGAAGT AGCAATCTTA GGAATTGGCC GTATTGCTCA AAAACCTATC GTTAAAGATG 4680

ATGGTGCAAC TGGCCAAAAT GCAATGAATC ACATTAAACG TTTATTAAAT AATCCAGAAT 4800

TATTATTAAT GGAGGGGTAA AACATGGTAG TTGGAGATTT CCCAATTGAA ACAGATACTA 4860

TAGTAATCGG AGCAGGTCCT GGTGGATACG TTGCAGCAAT TCGTGCAGCT CAATTAGGAC 4920

AAAAAGTAAC AATCGTTGAG AAAGGTAATC TTGGTGGTGT TTGCTTAAAC GTAGGATGTA 4980

TTCCTTCAAA AGCATTACTA CATGCTTCTC ACCGTTTTGT TGAAGCACAA CATTCTGAAA 5040

ACTTAGGTGT TATTGCTGAA AGTGTTTCTT TAAACTTCCA AAAAGTTCAA GAATTCAAAT 5100

CATCAGTTCT TAATAAATTA ACTGGTGGTG TTGAAAGCTT ACTTAAAGGT AACAAAGTTA 5160

ACATCGTTAA AGGTGAAGCA TATTTCGTAG ATAACAATAG CTTACGTGTT ATGGACGAAA 5220

AGAGCGCACA AACATACAAC TTTAAAAATG CAATCATTGC AACAGGTTCA AGACCAATTG 5280

AAATTCCTAA TTTCAAATTC GGTAAACGTG TTATCGACTC AACAGGTGCT TTAAACTTAC 5340

AAGAAGTACC aGGTAAATTA GTTGTAGTTG GTGGAGGATA CATTGGATCA GAATTAGGTA 5400

CAGCATTTGC TAACTTTGGT TCAGAAGTAA CCATCCTT6A AGGTGCTAAA GATATCTTAG 5460

GTGGCTTCGA AAAACAAATG ACACAACCTG TTAAAAAAGG TATGAAAGAA AAAGGTGTTG 5520

AAATCGTTAC TGAAGCTATG GCTAAATCAG CTGAAGAAAC AGATAACGGA GTTAAACTTA 5580

CTTATGAAGC TAAAGGCGAA GAGAAAACAA TCGAAGCTGA TTATGTATTA GTAACTGTAG 5640

GTCGTCGTCC AAACACAGAC GAATTAGGCC TAGAAGAATT AGGTGTTAAA TTCGCTGACC 5700

GTGGATTATT AGAAGTTGAT AAACAAAGCC GTACGTCTAT CAGCAATATC TATGCAATTG 5760

GTGATATCGT TCCAGGTTTA CCACTTGCTC ACAAAGCTAG CTATGAAGCT AAAGTTGCTG 5820

CTGAAGCAAT TGATGGTCAA GCTGCTGAAG TTGATTACAT TGGTATGCCA GCAGTATGCT 5880

TTACTGAACC AGAATTAGCT ACAGTTGGTT ATTCAGAAGC GCAAGCTAAA GAAGAAGGTT 5940

TAGOATTAA AGCTTCTAAA TTCCCATATG CAGCAAATGG TCGTGCATTA TCATTAGATG 6000

ATACTAACGG ATTTGTTAAA CTTATTACAC TTAAAGAAGA T6ATACTTTA ATCGGTGCTC 6060

AAGTAGTTGG TACTGGTGCA TCAGATATTA TCTCTGAATT AGGTTTAGCA ATTGAAGCTG 6120

GTATGAATGC TGAAGATATC GCATTAACAA TCCATGCACA TCCAACATTA GGTGAGATGA 6180

CTATGGAAGC AGCAGAAAAA GCTATOGGAT ACCCAATCCA TACAATGTAA TAACTGATTA 6240

TCTATAAAGA TTCAGTCATT AAAAGCTGTA GCATATGCTA CGGCTTTTTT GTTTTAGGTA 6300

AAGTAATGTA AGGAAATTGA TTTGAGATAT CGTTAACATG TGACATGCAT GTTATACTAG 6360

CGATGCTAAT AAAAGAATTG AAATGGAGGG TTCAACAATG GAATATGAGT ATCCAATTGA 6420

TTTAGACTGG AGTAATGAAG AGATGATTTC AGTGATAAAT TTCTTTAATC ATGTAGAGAA 6480

AATTGTGCCT GCTAAAGCAG AGGAAAAACA AATTTTTAAT ACTTTCGAAA AAAGTAGTGG 6600

CTATAATAGT TACAAAGCAG TTCAAGATGT AAAAACTCAC TCTGAAGAAC AAAGAGTAAC 6660

AGCTAAAnAA TAATTCGTTC GAAATTAACA CAATTTAATA GGAATTTTTC TTTAAAACTA 6720

TTGCTAATAA AGCTATATTT TGATACCTTT ATCAAGTGTT AAACAAAATG TTTGATAAAA 6780

GTAAACTTAA TATAGCTTTT TTAGGTGGAA AAATAAATGA ACATAGGTAA TAAAATTAAA 6840

AATCTTAGAA GAATTAAAAA TTTAACGCAA GAAGAACTTG CTGAACGTAC AGACTTATCG 6900

AAAGGCTACA TTTCACAAAT AGAAAGTGAA CATGCCTCAC CAA6TATGGA AACTTTCTTA 6960

AATATTATAG AGGTGTTAGG AACGACGCCA AGTGAATTTT TTAAAGACAG TGAAAATGAA 7020

AAAGTATTAT ACAAGAAGGA AGAACAAGTT ATTTATGATG AGTATGATGA AGGTTATATA 7080

TTAAATTGGT TAGTTTCAAA GTCAAATGAA TATGATATGG AGCCATTAAT ATTAACTTTA 7140

AAGCCTGGAG CATCATATAA AAATTTTAAT CCATCAGAGT CTGATACGTT TATTTATTGT 7200

ATGTCAGGTC AGATAACACT TAATTTAGGC AAAGAGATAT ATCAAGCACA AGAAGAAGAC 7260

GTTTTGTATT TTAAAGCACG AGATAATCAT CGTTTGTCAA ACQAATCAAA CAATGAAACA 7320

CGAATACTTA TTGTAGCGAC AGCTTCATAT TTATAGGGGG GATCTTATTT GGAACCGTTA 7380

TTATCATTAA AATCAGTTAG TAAAAGCTAT GAT6ATCTTA ATATCTTAGA TQACATAGAT 7440

ATTGATATTG AATCAGGATA CTTTTATACA TTATTAGGTC CTTCAGGTTG TGGTAAAACA 7500

ACAATTTTAA AATTAATTGC AGGGTTTGAA TATCCTGACA GTGGTGAAGT GATTTATCAA 7560

AACAAACCAA TTGGTAATTT ACCACCAAAT AAACGTAAAG TGAATACAGT CTTTCAAGAT 7620

TATGCATTAT TTCCACACTT AAACGTCTAT GATAATATCG CTTTTGGTTT GAAATTAAAA 7680

AAATTATCAA AAACCGAAAT TGATCAAAAA GTAACTGAGG CATTAAAATT AGTAAAACTT 7740

TCAGGTTATG AAAAAAGAAA TATTAATGAA ATGAGTGGCG GACAAAAGCA AC6TGTTGCA 7800

ATTGCACGTG CTATCGTAAA TGAACCAGAA ATATTATTGT TAGATGAATC TTTATCCGCA 7860

TTAGATTT6A AATTGCGTAC TGAAATGCAA TATGAATTAC GAGAATTGCa ATCTAGATTA 7920

GGtATTACAT TTATATTTGT aACACATGAT CCA 7953


(2) INFORMATION FOR SEQ ID NO: 153:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2347 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

GGCGTGATCA TACGACCGTC ATTCATGCTC ATGAAAAAAT ATCTAAAGAT TTAAAAGAAG 60

ATCCTATTTT TAAACAAGAA GTAGAGAATC TTGAAAAAGA AATAAGAAAT GTATAAGTAG 120 

GAAACTTTGG GAAATGTAAT CTGTTATATA ACAGCACTAA TGATnACAAT CATTTTTTAC 180 

ATTTCTATAT GCTAATGTGG CAAGATGAGC AAAACTCATT TTGTGGATaA TGTTTaAAAG 240

TCATACACAC CATACACAAG TTATCAACAT GTGTATAAyT cGcCAAATCT ATGTTTTTAA 300 

GACTTATCCA CCAATCCACA GCACCTACTA CTATTACTAA GAACTTAAAA CCTATATAAT 360 

TATATATAAA CGACTGGAAG GAGTTTTAAT TAATGATGGA ATTcACTATT AAAAGAGATT 420 

ATTTTATTAC ACAATTaAAT 6ACACATTAA AAGCTATTTC ACCAAGaACA ACATTACCTA 480 

TATTAACTGG TATCAAAATC GAT6CGAAAG AACATGAAGT TATATTaACT GGTTCAGACT 540 

CTGAAATTTC AATAGAAATC ACTATTCCTA AAACTGTAGA TGGCGAAGAT ATTGTCAATA 600 

TTTCAGAAAC AGGCTCAGTA GTACTTCCTG 6ACGATTCTT TGTTGATATT ATAAAAAAAT 660

TACCTGGTAA AGATGTTAAA TTATCTACAA ATGAACAATT CCAGACATTA ATTACATCAG 720 

GTCATTCTGA ATTTAATTTA AGTGGCTTAG ATCCAGATCA ATATCCTTTA TTACCTCAAG 780 

TTTCTAGAGA TGACX3CAATT CAATTGTCGG TAAAAGTGCT TAAAAACX5T6 ATTGCACAAA 840

CAAATTTTGC AGTGTCCAcC TCAGAAACAC GCCCAGTACT AACTGGTGTG AACTGGCTTA 900 

TACAAGAAAA TGAATTAATA TGCACAGCGA CTGACTCACA CCGCTTGGCT GTAAGAAAGT 960 

TGCAGTTAGA AGATGTTTCT GAAAACAAAA ATGTCATCAT TCCAGGTAAG GCTTTAGCTG 1020 

AATTAAATAA AATTATGTCT GACAATGAAG AAGACATTGA TATCTTCTTT GCTTCAAACC 1080 

AAGTTTTATT TAAAGTTGGA AATGTGAACT TTATTTCTCG ATTATTAGAA GGACATTATC 1140 

CTGATACAAC ACGTTTATTC CCTGAAAACT ATGAAATTAA ATTAAGTATA GACAATGGGG 1200 

AGTTTTATCA TGCGATTGAT CGTGCCTCTT TATTAGCGCG TGAAGGTGGT AATAACGTTA 1260 

TTAAATTAAG TACAGGTGAT GACGTTGTTG AATTGTCTTC TACATCACCA GAAATTGGTA 1320 

CTGTAAAAGA AGAAGTTGAT GCAAACGATG TTGAAGGTGG TAGCCTGAAA ATTTCATTCA 1380 

ACTCTAAATA TATGATGGAT GCTTTAAAAG CAATCGATAA TGATGAGGTT GAAGTTGAAT 1440 

TCTTCGGTAC AATGAAACCA TTTATTCTAA AACCAAAAGG TGACGACTCG GTAACGCAAT 1500

TAATTTTACC AATCAGAACT TACTAAAAAT AAATATAAAT AAAGGATGAC GTGATTAATT 1560

AAAACGTCAT CCTTTATTTT TTGGCAAAAA TAATTCTAGG TGCGTATGTA AAATAAATTT 1620 

GGCAGCATTT TAAACAGCAA ATAAAAGACX5 CCAATTAAAT TTATGACAAA TGTATCCAAA 1680

ATTTAATAAG TGTGCTTATA TGCCCTTTAA ATTTAAAATT TTAATAGTCA ATAACAAGTT 1740

AAAAATAAGA ATTAATTATT TATATGTAAA CGGTTTCTAC CTCTATTTTA AATGAAATTT 1860

GTGACAAAAA AAGGTATAAT ATATTAATGA CATACAAAGA AATGGAGTGA TTATTTTGGT 1920 

TCAAGAAGTT GTAGTAGAAG GAGACATTAA TTTAGGTCAA TTTCTAAAAA CAGAAGGGAT 1980 

TATTGAATCT GGTGGTCAAG CAAAATGGTT CTTGCAAGAC GTTGAAGTAT TAATTAATGG 2040 

AGTGCGTGAA ACACGTCGCG GTAAAAAGTT AGAACATCAA GATCGTATAG ATATCCCAGA 2100 

ATTACCTGAA GATGCTGGTT CTTTCTTAAT CATTCATCAA GGTGAACAAT GAAGTTAAAT 2160 

ACACTCCAAT TAGAAAATTA TCX^TAACTAT GATGAGGTTA CGTTGAAATG TCATCCTGAC 2220 

GTGAATATCC TCATTGGAGA AAATGCACAA GGGAAAGACA AATTTACTTG GAATCAATTT 2280 

ATACCTTAGC TTTAGCAAAA AGTCATAGAA CGAGTAATGG ATAAGGGACT CCATACCGTT 2340 

TTAATGC 2347 

(2) INFORMATION FOR SEQ ID NO: 154:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13542 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:

ACAAGACGTn TCTATAACTT ATCTGAAATC GCTCGTCAAG ATAAAGATTA TGCAACTATC 60 

TCATTCTTAA ACTGGTTCTT AGATGAACAA GTCGAAGAAG AATCAATGTT TGAAACTCAC 120 

ATCAATTATT TAACTCGTAT CGGCGATGAC AGCAATGCAT TATATCTTTA CGAAAAAGAA 180 

CTTGGCGCTC GTACATTCGA CGAAGAATAA TTAAACATCA CTACAATAGA CAGATAAATA 240 

TCATACGACA TGATAGGCAT TTGGGTCACT TACAATAACC CAATGTCTAT ATTATTTTGC 300 

TTTACGGAGA TCACTAGATT CATTTTCTGA ATCATTGATC TGCGTTTTTT CATTTTCAAG 360 

GCTAATTATT GTATTTTTAG TCATTTATTT TTTAAACTAC TAATGTTAAT AACTCTAAAT 420 

TTGATGTTGA ATTAATTTGA CGATTTTAAA GCATATCATC ATTTACTTTT TAATCAGAGT 480 

TACATCCAAA TGATAGATTT CACGTTATAC CTTCACGTAT AATATTATGT ATCGTTTGTA 540

AGCAAATGAC TAAAAGTCTA TTAATATATA CATTTAATTA ATTGAAAGGA TTGACTACAT 600 

GATACAAGAT GCGTTTGTTG CACTTGATTT TGAAACAGCA AATGGTAAAC GTACAAGTAT 660 

TTGTTCTGTC GGAATGGTTA AAGTCATTGA TAGTCAAATA ACAGAAACAT TTCATACTCT 720 

TGTGAATCCG CAAGACTATT TTTCACAACA AAATATTAAA ATTCATGGCA TACAACCAGA 780

aGATTTACCT GTTGTCGCAC ATAACGCGGC ATTTGATATG AACGTCTTAC ATCAAAGCAT 900

TCAAAATATT GGTTTACCAA CTCCAAATTT AACTTACTTT TGTAGTTATC AACTTGCTAA 960 

AAGAACCGTT GATTCGTATC GATACGGTTT AAAACATATG ATGGAGTTTT ATCAATTAGA 1020 

TTTTCATGGT CATCATGATG CATTGAATGA TGCCAAAGCA TGCGCAATGA TTACTTTTAG 1080 

GCTACTGAAA AATTATGAAA ATTTAACATA TGTAACTAAT ATTTATGGTA AAAATCTAAA 1140 

AGATAAAGGC TAGGACTAAA TAAAATACTC CCTTCAAAAG TAAGCATTGT AAAAATGTAA 1200 

ACTTTGCAGG GAGCTTTATT TTATATAAAG TCATATATCX5 TCATATTTTT ATAAGTTGAT 1260 

TGTTCTAAAT TACCTACAGT GACACCAATA AGTCGAATTG GTACATCAGG GTCTTTTAAA 1320 

TCXSTTATAAA GTAAATATGC AATATTATAA ATATCTTCTT CAGAACTAAC CGAATCTCTT 1380

AAACTCATCT GTTTAGATAG CGTTTCAAAT TGATAAGTTT TAATTTTAAC CGTTACAGTT 1440 

TTAGCTGACT TCTGTAATTT ATTTAGACGT TCAGCTGTTT TACCTGnACA ATTCCCATAC 1500

TTTTCTTAAA ATCTCTTCAT CATCATTCAC GTCTGTTGCA AATGTGCGTT CAGTCCCTAC 1560 

TGATTTTCTT ACTCTTGATG ATTTCACTTC ACTATGGTCA ATACCGCGTG CCTTGTTATA 1620 

TAAACCCCGA CCTCTTTTTC CAAACAAACG TATTAATTCA AATTCCGTTT TCTCATATAA 1680

ATCTCTACCG TTAAAAATAC CATTATCATG CATTACTTTT TT6GAA6CTT TACCTACX5CC 1740 

TGGaAAATCT CCAATATCCA ATGTCATCAA AATATCATGG aCATTTTGAT AATCAATCAC 1800 

AGTCATACCA TCAGGTTTAT TCATACCACT CGCTAATTTA GCTAAAAATT TGTTATAAGA 1860 

AACACCTGCA GATGCTGTTA AATGTGTCTG CTCTAGAATA TCTTTTCTAA TATACTGAGC 1920 

AATTTTCGAA GCAGGAAGGT CTGGTCTCAC TAATTCTGTA ATATCTAAAT ACGCTTCATC 1980 

CAATGACATC GGTTCTACCT TATCTGTATA ACTTCGGAAA ATAGACATAA TCTGCGCAGA 2040 

TGTTTCTCGG TAAGCACCAA AATTACTTGT GACAAAGTAT CCATTTGGAC ATAATTTATG 2100 

CGCTTGTGAC ATAGGCATTG CTGAATGGAC GCCGTATTTT CGTGCTTCAT AGGATGCCGT 2160 

AGAGACAACA CCCCTACTGC TTGCTTTACC ACCAACAATG ACTGGTTTCC CTTTCAATTT 2220 

GGGGTTATCT CTCATTTCGA CTTGTGCAAA AAAATAGTCC ATATCTATAT GAATAATTCG 2280 

TCTCTCAGTC AAGTGCTCAC CTCCCTACTA ATTTTTACTT TTATAACGCA CAAAAATATC 2340

TCAACATAAT TATACGCTGT GTACGATTTT TTTACATAAA TCTTGCACTT AGCGATAACT 2400 

ATATTGaGAT AACTACAAGT TGTTATaAAA TCAATTGCTA TTTAAGCATG ATGATGAAGA 2460 

CGATTGAGTA AGAAAACATA GGTAATCTGA AATAATTCAA GCAAATTCAT TTTGTTGGTA 2520

TCATCATATT AAAATTTATT ATTGAGTCGG CTTTTGATGA TACAAATAAA TACTATCTTC 2580 
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AAAGCAATAA GCGGTATGCA TACTAAACAT AAAAATAAGT GATGAATAAC CAAATACCTT 2700 

AATTAAAATA AGCAAGCCAG TACTTAATAG GATTAGTGGT GACAGCATAA TAATTGAGAA 2760 

TTGCCATTTG TTGAAGCAAG CATCTGCTGT TTGGAATAAG ATTCTGTCTT TTTTTATATT 2820 

AAACATAGGT TTGCTATCTT TTTTAAATAA AAGAAATAAT GCTCTATGGA TAAGTTCATG 2880 

TAAAATCAAT AAAATAATGA ATCCAGCAAA CCCATATACA AGATTGATGA TGATATTTTG 2940 

ATCGACAACC GCTGTGACAC CTAACGCCCA CTTATACGTA AATAAAATCA CGAATAACGC 3000 

AATAACAAGT TGCAAGATAA TAAACCTTCG CATTTGAAAA TTATTT6TCX3 TTAAATCAAT 3060 

TTTATGCATT ACCAACCCTC CCGATCATGA CATTCTTATT CTTCTTTAAA TATAGTATAC 3120 

AATGTCACAT TTAATTTAAA AAGTTCATAT CAAGAAAGTA AATTGGCTGT AATAAAATTT 3180 

TAATATACGA CTTCTTTCTT CACTTATTAA GGCX3AAATTT TATCtCAAAT CATGTGCGCT 3240 

ATTTCAAATT GAATAATQCC ACTGTCTCAA CATGTGTTGT TTGTGGAAAC ATATCTACCX5 3300

GTGTTACCTC TTCAAGTTGA TATTTTTCAG CTAATAATAA TGCATCACX5T TGCTGTGTTG 3360 

CGGGATTACA TGAAATATAG ACAATACGCT TAGGTTCTAA TG7AAGCAAA GTCTGAATAA 3420 

ACGTTTCGTC ACAGCCCTTT CTTGGCGGAT CAACCATTAC AACATCTGGT TTAATCCCTT 3480

GTGCTTTCCA TTGTAAAATA ACTTCTTCAO CTTTCCCACA GACAAAAGTT GTATTATT6C 3540 

ATTGGTTTAT AGTCGCATTT TGTTGTGCGT CTTCAATTGC AGAAGGTACT ACTTCAACAC 3600 

CGTATACATG TTTTGCAAGT GGTGCCATAT ATAGCCCTAT TGTTCCAATA CCACAATAGG 3660 

TATCTAATAC AACTTCATTA CCTGTCAATT GCGCATACTC AATTGCTTTA TTATATAATT 3720 

TCTCTGTTTG TTCAGAATTA ATTTGGTAGA ATGACTGATC ACTTATTTTA AATGTACTAT 3780 

CTGTTAATTG ATCAATAATT GTATCTTTAC CATATAGCGT TATAGATTGA CGTCCCATAA 3840 

TAACATTAGA GTGGCTATCA TTAATGTTTT GTTTAATGCT TGTCACATTA GGAAATGCAT 3900

CTAATATCTT CTCAACAACA GCATTTTTTT GTGGCCACTT TTTACCATTA GTTACAAAAA 3960 

TAATCATCAT TTC6TCTGTA TGATATCCTG TTCTTACAAC CAAATGTCTC ATTAAACCTT 4020 

TTTTCAATTG TTCTTGATAA ATACTTACAT TTAAATCTTT TAAAATAGAT TTAACTTCAT 4080 

TCATCACTTC TTGATGTTGT GAATCTTGTA TTAAACAACT TTCCATGTCA ATAATGTCAT 4140

GGCTTCTTTG ACGATAAAAG CCCATAATAA CTTCATTCTG TTCATTCTTA CCAACTGGAA 4200 

TCTGGGACTT GTTTCGATAT CTCCAAGGAT CTGTCATGCC AACTGTATCG TTAATCTTAG 4260 

AATTATCAAA ATGCGCTTTT CGCTGAAACA AATTAATCAC TTGTTCCTTT TTCATTTCAA 4320 

GTTGTGCTTC GTATGATAAG TGTTGAAGTT GGCACCCACC ACAACGTTCA TAATATATAC 4380 
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AGTTCTTTTT TACTTTGATA ATTTTATATT CAATTTGTTC ATTAATTAAA GCTTGTGGTA 4500 

TGAAAATAQG AAAGOGATCT ATmTACGA CACCATGGCC TTCATGCGTT AAATCAACAA 4560 

CTGTTCCCGT TTTTATGTCA TTTTTAGCTA TTGCTTGCAA AATTTTACCT CCAAAATGAA 4620 

CAGGTTAGGA ACAAAATTAT GCGCTTCCTA ACCTGCCATT ATATATTTCA CTATTTCTGT 4680 

TTATTCTTCG ATTAAATTGT CATCAACATG ATCATTATTT ATTAACTCTT CATTTACAAT 4740

ATCATTAGGT GCAAAGACAT CTATATGACG TTCTAGATTT AAGAAATTCG CTGGTAATTT 4800 

ACCACCATAT TCTCCATCTA CATTTAGTTG TAAGTCTGTG AATGATGAAA TATTAATTGC 4860 

CTTTGCTTTT TCATAAATAA CTTTAGGATG CTTAGTATGT TCTCCTCTTG AAGCTAAAGT 4920 

CATAATATGA CCAAGTTCTG CAAGGTTTGA TTTTTCAACT ATAATTAACG TAAAATAGCC 4980 

GTCATCTAAC TTAGCGTCCG GCACTAATTT TTCAAATCCT GCCATTGAAT TTGTTAAACC 5040 

TAAAAAGAAT AATAATGCTT CTCCTTGGAA AACATTACCA TCATATTCAA TTCTTAAATC 5100 

TACAGCTTTC ATTTGAGGTA ACATTTCGAA ACCTTTGaTG TAATAAGCAA ATGGACCAAC 5160 

AATAGATTTC AATTTACTCG GTGTTTCATA AGAGACTTGC GTCAATTGTC CGCCTGCAGC 5220 

TAAATTAATA AAGTATCGAT TATTCATTTT ACCAATATCT ACTTTAGTAG AATGACCTTC 5280

AATGATGACA TCAAGTGCCC CCATGATGTC ATTAGGTATA TGCAATGCAC GTCCAAAGTC 5340 

ATTAACAGTA CCCATAGGAA TGACACCTAG CTTAGGACGA TTAGGCTTTT CTGCGATACC 5400 

ATTAACTACT TCATTTAATG TTCCATCACC ACCTGCAGCG ATTAATACAT CATAATTTTC 5460

ATGCATAGCT CTTTCTGCTT CAAGTGTGGC ATCACCTATT TTCTCGGTTG CATATGCACT 5520 

CGTTTCATAT CCCGCTTTTT CTAATTTTAT TAAGGCATCA GGTAATTCTC TTTTAAATAG 5580 

CTCTTTACCT GATGTCGGGT TATAAATGAT TCTAGCACGT TTCCTCATAT CTTATCCCTC 5640 

TAOTAAAAT TCATATATTT TAACTTCATC TTTGTTTCGT CTAATAGGGA GTGGGACAGA 5700 

AATAATATTT AACAAAATTT ATTTCGTTCT ACCCCAACTT GCATTGTCTG TAGAATTTCC 5760 

TTTCGAAATT CTCTATGTTG GGGCCCCACC CCAACTTGCA CATTATTGtA AGcTGACAGA 5820 

AAGTCAGCTT CTTTGTTTGG GGGCCCCGCC AACTTGCACA TTATTGTAAG CTGACAGAAA 5880 

ATCAGCTTCT ATGTTGGGGC CCCACTAGAA TTGAAAAAAG CTTGTTACAA GCX3TATTTTC 5940 

TTTCAGTCAA CTACAGCCAA TATAACATTG TAGT6CCTAG GACATTGAAT TTATGACCCA 6000 

GGCTCAGTCT TATTTCATCA TTCTTAATAT CGTTAAAGAC CAACTTGTAT CTTAAACAAA 6060 

TACTATCTCA ATATGTACAA AGCTTGTTAT TTATTCAGCA TTTTTTGCCX5 TTCTTCATTA 6120

TAtAGcTTCG TCAGTTATGC TATTTTACCT TTAAAAT6AT QTTQTAAATA TAATG1TGTC 6180 
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AACGCATTAA TAAAATTAAT ATTTTTACCA TTAACATGTA CAATGAATAA AGTTAAAAGT 6300 

AATTTGACTT CTATAGATAT AAATAAACCC TCGATTGCAT CTAAGTCAGC AATCAAGGGT 6360 

TTATTTTTTA AATCTTCATA GTTTGATGAT TTAAATTATC TTTTATCTAA TTCTTGTTTT 6420 

AATAGTTGAT TTACTAATTG TGGATTAGCT TGACCTTTAG ACGCTTTCAT AATTTGACCA 6480 

ACTAAGAAGC CCATAGCTTT GCCTTTACCA TTTTTGTAAT CTTCAACTGA TTGTTCGTTA 6540 

TTGTCTAATG CTTCATTTAC AAATTTTAGA AGTGTTGCTT CATCAGAAAT TTGAACTAAG 6600 

CCATTATCTT CCATAATCTG TTTAGCATTA CCACCTTTAG CTGCTAACTC TGGGAAGACT 6660 

TTCTTCGCAA TTTTACTGCT CATTGTTCCX3 TCTTCGATAA GTTTAATCAT ACCTGCTAAA 6720 

TTTTCTGGTG TTAATTTAGT ATCTAATAAT TCTACTTGAT TTTTATTTAA ATATTCGTTT 6780 

ACGCCACCCA TTAACCAGTT AGATGTTAAT TTAACATCTG CACCGTGTTC AATTGTTGAT 6840

TCAAAGAAAT CTGACATTTC TTTAGTCAAT GTTAATACX5T GTGCATCGTA TGCAGGTAAA 6900

CCTAATTCAT TTACATACTT AGCTTTACGT TCATCTGGTA ATTCAGGAAT TGTCTGACGA 6960

ACACGCTCTT TCCAAGCATC ATCAATATAT AAAGGTACAA TGTCAGGCTC TGGGAAGTAA 7020 

CGGTAATCAT CAGAACCTTC TTTAACACGC ATTAAAATTG TTTTACCTGT AGATTCATCA 7080

AATCGACGTG TTTCTTGTCC GATTTCTCCA CCATTTAACA ATTCTTCTTC TTGGCGTTTT 7140 

TCTTCATATT CTAAACCTTT ACGTACATAG TTAAATGAGT TTAAGTTTTT CAATTCGGCT 7200 

TTAGTACCAA ATTTTTCTTG ACCATATGGA CGTAAAGAGA TGTTAGCATC ACAACGTAAA 7260 

GATCCCTCTT CCATCTTAAC GTCTGATACA CCAGTGTATT GAATAATTGA ACGCAATTTT 7320 

TCTAAATATG CATATGCTTC TTTAGGTGAA CGAATATCTG GTTCAGATAC GAITTCAATT 7380 

AGCGGTGTAC CTTGACGGTT CAAGTCAACT AATGAATACT CACCTTTATG TGTTGACTTA 7440 

CCAGCATCTT CTTCCATGTG AAGAOGAGTA ATACCX3ATTC GTTTTGTTTC ACCGTCGACT 7500 

TCGATATCGA TATATCCATT TTCACCAATT GGTTGATCAA ATTGAGAAAT TTGATATGCT 7560 

TTTGGATTAT CTGGATAGAA ATAGTTCTTA CGGTCAAACT TAGATTCTGT TGCGATTTCC 7620 

ATATTTAGTG CCATTGCAGC ACGCATTGCC CAGTCTACTG CACGCTTATT AACAACTGGT 7680 

AAGACACCTG 6ATATGCTAA GTCGATAACA TTTGTATTTG AGTTAGGTTC TGCTCCAAAA 7740

TGCGCTGGTG ATGGAGAAAA CATTTTTGAG TCCGTTTTTA ACTCTACGTG AACTTCAAGT 7800 

CCTATAACTG TTTCAAAATG CATGATTTCC ACTCCTTATA ATTTTTCATA AACGTCATGT 7860 

AAATTGTATT GTGTTTCATA TTGATAAGCG ACACGATATA ACGTTTTTTC ATCGAATGGT 7920

TTACCAATGA ACTGTAAACC GATTGGTCGG CCATTTGATT GTCCACAAGG AACAGAAATA 7980 
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GGATCATCAA TTTCTTCACC TAAATTAAAC GCaGTgTnAG GCGCTGTTGG ACCAACTACT 8100 

ACATCATAAT TTTCGAATAC TTTATCAAAG TCATTTTTAA TCAATGTTCT AACTTTTTGA 8160 

GATTTTTTAT AGTAAGCATC ATAGTAACCT GAACTTAATG CAAATGTACC TAAGAAAATA 8220 

CGACGTTTTA CTTCTTTACC GAAACCTTCA GATCTTGACA TTTTATATAA TTCTTCTAAT 8280

GAATGAGCTT CTTTAGAATG ATAACCATAA CGAATTCCGT CAAAACGAGA AAGGTTTGAC 8340 

GAAGCTTCTG ATGATGCAAT CACGTAATAT GATGGAATAC CAAATTTAGT ATTTGGCAAT 8400 

GATACTTCCT CAACGACAGC ACCTAAAGAT TTTAAAGTTT CTACAGCGTT TTGAACTGCT 8460 

TCTTTTACGT CATCAGCTAC ACCTTCACCT AAGTATTCTT TAGGTAATGC AACTTTTAAT 8520 

CCTTTAATAT CTTTACCAAT TTCAGATGTA AAGTCTACAT CATCAACTGG TGCACTTGTA 8580 

GAGTCATTAA CATCTGCACC AGAAATAGCT TCTAATACX3A TTGCATTATC TTTTACATTT 8640 

CGAGTCAATG GACCAATTTG GTCTAATGAA GATGCAAAAG CAACTAATCC AAATCGAGAT 8700

ACACGACCGT ATGTTGGTTT CATACCGACA ACGCCACAAT ATGCAGCCGG TTGTCTAATT 8760 

6AACCACCTG TGTCTGAACC TAAGCTAAAT GGTACTAAGC CAGCTGCAAc TGCTGCTGCA 8820 

GATCCACCTG ATGAACCACC TGGCACTGCT TTATGGTCAA ATGGGTTAAC TGTTTTTTTG 8880

AAATAAGATG TTTCTGTTGA ACCACCCATT GCAAACTCAT CCATATTTAA TTTACCGATT 8940 

AAAACX5GCAT TTTCATTATG TAGTTTTTCC ATTACACTAG ATTCGTAAAT TGGCACAAAA 9000 

CCTTCTAACA TTTTACTTGC ACATGTTGTT TCTAATCCGT TTGTAATAAT GTTATCTTTT 9060 

ATACCCATTG GAATACCAAA TAATTTGCCA TCCATTTGAT CTTTTGCTTG TAATTCATCC 9120 

AATTCTTGCG CTTTTTTGAT TGCATTTTCT TTATCCAGCG CTAGAAAAGA CTTAATTGTT 9180 

GGATCAGTCT CTTCAATTGC ATCATATATA TCTTTAACAA CATCAGATGG TTTGATTTTT 9240 

TTGTCTTTTA TTAAAGTTAA TAAATTCTCA ACCGATTCGT AGCGAATGCT CATCTTACGC 9300 

GTCCTCCTCA TTCATGATTG TAGGCACTTT AAATTGTCCA TCTTCTOTTT CTTTGGCATT 9360 

TTTCAAAGCT AATTCTTGTG GAATACCTTT AATTGCTTTA TCTTCACGTA AAACGTTTTG 9420 

TAAATCTAAA AOGTGATATG TAGGTTCAAC GCCTTCTGTA TCAGCGCTAT CATTTTGTTT 9480 

TGCAAAATCT AAAATGCTTT CTAATGTGTT GGCCATTTCT TCCGTTTCTT CAGGAGAAAT 9540

TTGAAGTCTT GCAAGATTCG CGATATGCTC AACTTCTTCA CGTGTTACTT TTGTCATTAA 9600 

TAAAAGCCTC CTTTAAGTCA TTCATCACTA AATTGTATCA AATTTCCAAT TAAAAATCTA 9660 

AGTATTTATG AGGTGCTACT TTAATTTCAT ATAAACTGTA TAAACATTAT CATTCGTTTA 9720

TCAAATCATT TTTTATGAAA ACAACACTCT TTTAATATTA GACAACCCAA TTCAATATTA 9780 
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TATATTGGTA TGCAAGTATT TCAAAAAGAA TAAATTTAAT TTTCCTACTT TTCTAAACAT 9900 

TTATCTTTAT GTATAATGTT TTCAAGTAAC TAAATTATAA ATTAAATAAA GGGAGTGTTT 9960 

ATCATGCTTA CAATGGGGAC AGCATTAAGT CAACAAGTAG ATGCCAATTG GCAAACTTAT 10020

ATTATGATTG CCGTCTACTT CTTGATACTA ATCGTTATTG GCTTTTACGG TTACAAGCAA 10080 

GCAACTGGTA ACCTAAGCGA GTACATGTTA GGTGGACXJTA tATTGGACCG TATATTACTG 10140 

CATTATCAGC TGGAGCTTCA GATATGAGTG GATGGATGAT TATGGGGCTA CCTGGTTCTG 10200 

TCTATAGCAC TGGTCTATCA GCTATGTGGA TTACAATCGG TTTAACATTA GGTGCTTATA 10260 

TAAATTACTT TGTTGTTGCT CCTAGACTTC GTGTTTATAC CGAATTAGCT GGAGATGCAA 10320 

TTACATTACC AGATTTCTTT AAAAATCGTT TAAACGATAA AAATAATGTG TTAAAGATTA 10380 

TTTCTGGATT GATTATCGTA GTATTCTTTA CATTATATAC ACATTCTGGT TTCGTATCTG 10440 

GTGGTAAACT ATTTGAAAGT GCTTTTGGAT TAGATTATCA TTTCGGTTTA ATATTAGTTG 10500

CTTTCATTGT CATTTTCTAT ACTTTCTTTG GTGGATATTT AGCTGTATCA ATTACAGATT 10560 

TCTTCCAAGG TGTCATTATG TTAATTGCGA TGGTTATGGT CCCTATTGTT GCTATGATGA 10620 

ATTTAAACGG CTGGGGAACG TTTCATGATG TAGCAGCTAT GAAACCTACA AATTTAAATT 10680

TATTTAAAGG GTTATCATTT ATAGGAATTA TCTCTCTATT TTCATGGGGA TTAGGTTATT 10740 

TCGGTCAACC TCATATCATT GTAAGGTTTA TGTCTATTAA ATCACACAAG ATGCTACCTA 10800 

AAGCTAGACG TTTAGGTATT AGCTGGATGG CTGTTGGTTT ATTAGGCGCT GTGGCTGTTG 10860 

GTTTAACAGG TATTGCATTC GTACCTGCTT ATCATATTAA ACTAGAAGAT CCTGAGACAT 10920 

TATTCATCGT GATGAGTCAA GTACTCTTCC ATCCTCTTGT AGGTGGTTTC TTACTTGCTG 10980 

CGATTCTAGC TGCAATTATG AGCACGATTT CTTCACAATT ACTTGTAACA TCTAGTTCAC 11040 

TAACGGAAGA CTTTTATAAA TTAATTCGTG GTGAAGAAAA AGCTAAAACG CACCAAAAAG 11100 

AATTTGTTAT GATTGGAAGA TTATCTGTAT TAGTTGTAGC AATTGTTGCC ATCGCGATTG 11160 

CATGGAATCC AAACGACACA ATTCTAAACT TAGTAGGTAA CGCTTGGGCC GGATTTGGTG 11220 

CATCGTTCAG TCCACTTGTG CTATTTGCAC TTTACTGGAA AGGTTTGACA CGTGCCGGT6 11280 

CTGTAAGTGG AATGGTTTCA GGTGCCTTAG TCGTTATCGT TTGGATTGCA TGGATTAAAC 11340

CATTGGCACA TATCAACGAA ATATTCGGCT TATATGAAAT TATTCCTGGA TTTATTGTAA 11400 

GTGTAATCGT TACATATGTT GTAAGTAAAC TTACTAAAAA ACCTGGTGCA TTTGTTGAAA 11460 

CTGACTTAAA CAAAGTTCGT GACATCGTTA GAGAAAAATA ATTCATAAGT CTTAACAAAT 11520

TAAAAAGGTA CTAATGTTAA TCAAAATTAT GACTAACATT GGTACCTTTT TATTATCTTT 11580 
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AATTAAAGCA 


CGTGGTTGGT TACCATCTTT AATACGAATT TCATAGTTAT CGATTTTATC 11700

GAAATATTTA TTCGCTTGTT CAGTAACGTA CTGTGTAATA CCAATTGTTT CAGCTTGTCC 11760

ATAGTAATCG ATTGGTAAAT CTACTACTAA TCGTTGTGGC TTTTTATCAA CAAATTTAAC 11820

TTTCCCTACT GCTTGTGTGA AATTAGAAAA ATATGATTGC AAATTATCAT TAAATTGCTT 11880

GAAATTATTA TTTAAATTTT CATCATAATC TGCTGCTGTT GAAGAAGGTA ATAAAGCTGA 11940

TTTTTCATTG ATATTATGCC ATTCATTAAG CTTTGTTTGA CTCTTTTCTG CAGTCGCTTG 12000

AGTGATAAAT TCACCTGGTG TGATTGAATC TTCACTTGAT TGCTTATAAA TTGCAAAATG 12060

AATTGGTATA TCTTTTAAAT CATCATTTTC ACX5TAACCTT GATAATATCT CACTAGCCAT 12120

TTGTTTACCT TGCTTTTTAA CTcGCTATCA TCTAGTTTTT TACTAAAAGT CGATCCATCT 12180

TATAGTAATA AACACTATTC ATAGCTAAAC CAATCGTCAT ACCTTTAATA 12240

TTCTTACXTTT TTGTATCTCC ACCACCATAA AAATCTTGCT CTAAAATGTT AGATAAATAG 12300

GCTGGTGATT TTTCTGCAAT CTTTTCAGGA TCTGTTTCAC CTtCGTGTGA TGGATTAAGT 12360

CCTAAATTTT CATTCGCTTT CTTGTCTTTT TTATCTTTTT CAGACATTTT ATCGATTTCA 12420

CGTTTTGTAT ACTTAGGATT TAAATAGGCA TTAATTGTTT TCTTGTCCAA AAATTGACCA 12480

TCTTGATACA AATATTTATC TGTTGGAAAT ACTTCTTTAC TTAAGTTCAA TAAACCATCT 12540

TCAAAGTCGC CGCCATTATA ACTATTTGCC ATGTTATCTT GTAAAAGTCC TCTTGCCTGG 12600

CTTTCTTTAA ATGGTAACAA TGTACGATAG TTATCACCTT GTACATTTTT ATCCGTTGCA 12660

ATTTCTTTTA CTTGATTTGA ACTATTGTTA TGTTTTTGAT TATCTTTTCC AGCCTGGTCA 12720

TCCTTATGGT TACCACAAGC AGCGAGTATA AAGATAGCTG TAATCAATAA TACTAATGTA 12780

CGCTTCATCG ACATACCCCT CTAACTATTT AATTCATTTT GCTTATCTAC AAATTGTTGC 12840

TCTGTCCAAA TTTCAATACC TAAACTTTGT GCTTTTGTTA ATTTTGAACC TGCATCTTCA 12900

CCAGCAATAA CGACATCTGT ATTTTTAGTA ACGCTACTTG TAACTTTAGC ACCTTGTGAT 12960

GCAAGCCATT TAGATGCTTC ATTGCGTGTC ATTTGATGTA GCTTACCAGT CAGTACTATC 13020

GTTTTACCAC TAAATTCAGG ATGTCCTTCA ATATCTGATG TTTTGATACC TTTATAAATC 13080

ATATTAACAT GTTTATCTTT TAATTTTTGA ATTAAAGCAC GAATATCTTC ATTTTCTAAA 13140

TAAGTAACTA CAQATTGTGC TACTTTATCA CCTATATCAT GAATTTCTAC TAATTCCGCT 13200

TCAGTTACCG TTAGTAATCG ATCTATOGTT TCATATTTTT CTGCTAACAC TTGGCTCGCT 13260

TTAACACCTA AATGCCTAAT ACCTAGACCA AATAATAAAT TTTCTAAAGA 6TTGTCCTTA 13320

GCTTGTTGAA TGGCAGCTAA TAAATTATCA ACTTTTTTCT GCCCCATTCT GTCTAAAGGT 13380



TAAAGCTGTT GAATAATTTT AGTGCCTAAA CCATCAATAT TcATGGCTTG TCTTGaTACA 13500 

AAGTGnATCa ATCCtTcAAC AAGTTGTGCT TGGTCATTTT GG 13542
(2) INFORMATION FOR SEQ ID NO: 155:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1893 base pairs

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:

CAGTAAACAC CTCTGATTAC GAATATTTAT ACATTTATTT TAACACATGC ACTGATTTAC 60 

GACTACTAAA CACCTTTACG TAAAAAGGGT AAACATGGTT TATCTATCTT GGTTATCTAT 120 

TTATAAATAT TTnTCATATT ACGCATAACA ATTGCTTAAA ATATGTATAA AAATGAATAT 180

ATGTGTAATA AACTT6CTAA TTATTAGATT TAATAAGCGT CAATTGTTTG AACATATTtA 240 

ATTAAAATCA CATTGATATC ACAGATACGA ATATTGTCGT ATAGAAATTG AAAATTCTAT 300 

TTTTTAAATG AAAGTCTTCA ACATAATTTT AAGTTTCAAC ATGAGAAAAA TCGATTAACA 360

AACAACGTCA GTTGAATATG CCTTTTGAGA CATTTCAAAC TTTACAATTG TTGCTAATCG 420

ATATATTTGC TTTTAGTGAT CCCTGCTATA AAATAAATCA ACGATTTCTA ATAAGTGTTT 480 

TGTATTGAAT TGTTCATCAA TTTGCGTTAG TTCATCCACT GCTGCGTCTC TATGATAAGT 540 

CAATTTATCT TCTGCGCCAT CTTTCCCTAA TAAACTCACG TACGTACTTT TATTATTTTC 600 

AAGATCGCTG CCCACTTTTT TACCTAACTT TGCTTCATCA CCATAGCAGT CTAATAAATC 660 

ATCTTTAATC TGGAACATCA TACCTAAATG ATAACTATAA CTTTCTAAAT GTTCTTTAGT 720 

TGTiCrCATCG ACATTAGCGA TATCTGCTGC ACTCATAACC GCAAAAGTTA ATAATGCTCC 780 

TGTTTTTGTT TTGTGTATCA TTTCCAAAGT TTCAAGATCA ATTGGTTGGC CTTCGCTTTG 840 

CATATCTAAC ATTTGACCGC CGACCATTCC AACATGACCA CTTGCTATTG ACAGCCGTTG 900 

TAGAACTTTT ATTTTTACTT CATCAGTTAA TCTATCATCA CTTGAAATAA GTTCAAATGC 960 

TTTAGTTAAT AAAGCATCAC CTGCTAATAT CGCAGTCCAC TCACCATATA CTTTATGATT 1020

TGTTAATTTT CCTCGTOGAT AATCATCATT ATCCATCGCT GGTAGGTCAT CATGAATAAG 1080 

TGAATATGTA TGAATCATTT CTAGTGCAAT TGCGCTCTTC ATACCTAACT CATACTCGGT 1140 

ATTTAGTGAA TCTAAAGTGA GTAATAACAG AACTGGTCGG ATGCGTTTAC CTCCAGCATT 1200

TAATGAATAC AACATACTTT CTTCTAGCTG AGTATCCATT ACTGATTTAT TTATCGCAAC 1260 
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CATCCTCAGC TTCTTCTTTT ATTAAGTCAT TCACCTTTTT TTCGGCATTT TTTAAAGTTG 1380 

TGTCACAAGC TGCTGATAGT TTCATACCAC GTT6ATATAA ATCTAATGAT TCCTCTAAA6 1440 

ATACTGTTTC ATTATCTAAT TTTTGAACAA TTTGCTCTAA TTCTTGCATC ATTTCTTCAA 1500

AACTTTGCGT TTCTTTAGTC ATTATTACAC CTTACTTTCG TAACTTTTGC ATCTACTAAG 1560 

CCATCTTTCA TTGTTAACGT CAATTGATCA TTTTCTGTTA AATCTTTAGT ACTCGTAATG 1620 

ACTTCGTCTT TTTTATTAAC AATTGCATAT CCACGCAACA TTGTATTAGT TGGACTTAAA 1680

TTGTTTAAGT TTTCTACTTT ATTTTTCAAA TCATTTTTAT AACTTAATAT CTTAGAATTC 1740 

AATAATrrAA CAAGTTGGTT TGTCAATTGA AGATTATnTT GTT6TTCTTG ATTAACACTA 1800 

CTTAGTAATG CTTTTAAATn ATAACGTTGG TGCAACAGCA TTAAATCGAG GCCCCGGTGG 1860 

TCCAAAGTTG CCCGAATTnG TGGTTTCAGG CCC 1893 
(2) INFORMATION FOR SEQ ID NO: 156:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 821 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 

AAAATATATT CCTTCACTTA ATATTCAATT AGAGAAAAAC ATGGT6ATTG TAATATGTTG 60 

TGCAATATTT CTGGGTGTTT TAATACTTTT TTTATTTCTG AATCGTAAGC TAAGGTTGGA 120 

AATTTATAAT AATAACTCTA GTAAAGGGAA AATAATTTTA TTTCCTTCAT TAAAAAACTT 180 

TTGTTTCACA ATATTTTATT ATTTTTTATT TGGCGGTCTT TCAATAATGG CTCTAAGTAT 240

GTTATTAACT TTAAATCCTC AAAATATAAT AGGCTTTATT GGTTGGTTGG TAATGACTGC 300 

AGGTTTCTTT CT6TTAAACA TGTCATCGAT TATTGACAAA AAAATTTATG TATTATCTAA 360 

AACTAACACG GTGGAAAAAT GATGGTTTAG CTGGATTTAC TGCAGGTTCT ATTTCGGCAA 420 

TACTTGTATA TTGGACCAAT CAAAAAAATG AATTTOGAAT AAAAGATAAA AACGATTGGA 480 

TAGGACATAA ACTAGACGTT GGTATAGATG CTGTAGAAAA ATCTGCA6AA AAAACAGTAG 540

ATGGTGTTGA AAATGTCATG GTGAAGCTTC AAAAAGTATT TCTAATCATA TAAGCCCTAA 600 

QAAATGGAGC TGGTAAATGT TGCTATGCGA ATCTAAAATC ATCAATAAAA ACCCAAAATA 660 

TAGAATTATT AAATATAATG ATGAATACTT AATGGTCGAT ATAATAAGCA CTTGGATTAG 720

TTTATTTTTT CCTTTTATTA ATTGGTTCAT CCCaAAAGaA TACGTCAAAA TTAGTAGAGA 780 



(2) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2343 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157: 





AGTAAGATAA TTTTCAATTA GAAAATATCT TACTGCTGTT CTCTATTTAT ACAATACTTC 60

GTATTGAATG GcTTCGCTTT CCTAGGGTGC CGTCTCAGCC TTGGTCTTCG ACTGGCACTG 120

CTCCCTCAGG AGTCTCGCCA TTAATACTAC GTATTAACAT GTAATTTTAC TTTGAAATAC 180

TTTTAAAAAA TAAGACACTT TGCCCAACTT GCACATAAAT GTAAAATTCA ATAAAATGAA 240

TTTTCTGTGT TGGGTCCCTT CTTATAATTT AATAAATACC ACTAAACTAA ATTAACGAGG 300

TGCdTATGT ATAAAAATTA TAACATGCCC CAACTACACT ACCAATAGAA ACTTCTGTTA 360

GAATCCCTCA AAATGATATT TCACGATATG TIAATGAAAT TGTTGAAACr ATACCTGATA 420

GCGAATTCGA TGAAXTCAGA CATCATCGTG GCGCAACATC CTATCATCCA AAAATGATGT 480

TAAAAATCAT CTTATATGCA TATACTCAAT CTGTTTAATT ATGTTCAAAG CATTAAGGTA 540

ACAAGACAAT ATCTAAGATA TCAAAGATAG AAATTTTTTG ACGTTGTTGC TGATTGTAAA 600

CATAACCATC AATTTCATAA TTAATAGCAT CAATACGATA AATGGTTAAG CGTACTGAAT 660

CTACAAAGCC ATTATTATAA AATTTAACTT CTACAGGTTG GGCATATTGT AGCGCCTCGT 720

GTAGCCGAAT GTTTAGCTCA GCCAATTGAT CATCTGATAA TACAGGACGT GTAATTTTGT 780

TTTGGTCGAT AATGTATTGT TGAATCGTTT CGAATTGTTC GGGTAATGTT GCAAAAGGAG 840

CCCStTTAAT CATGCCTCTT CCCATAGGTA TATTGTTATC TAGTAATTCT CTTGGAACGT 900

TACGATAATC AGTTTCTTCT TCATAACTTG TCATCCTTAA TTCACCCCAA TCTGATAATT 960

ACATTATACG AACATGTGTT CTATTTTGCA ACAAAAATTT TGTGGaAGCA TAAACGOGTT 1020

AATAATTAAT GCTCGTGtAA GTAAAAAAGA GGGATTAATT AAAATCGAAT AATGaCATAT 1080

CACaGCAAAT AGTTCTTTTA AAGTAGTTAA ATAGTTTTAG CTTTAAGGAA aTGATAAaTG 1140

ATTGTwAATT CTAGCTAAAA TTTAATAAAA TGAAAATAAG ACTAACATGG AGGGGTAAAA 1200

GTAATGACAA ATGGATATAT TGGTTCTTAC ACTAAAAAGA ATGGTAAAGG GATTTATCGT 1260

TTTGAATTAA ACGAAAATCA GTCACGTATT GATTTATTAG AAACAGGATT TGAATTAGAA 1320

GCGTCTACAT ATTTGGTGCG TAATAATGAA GTTTTATATG GAATCAACAA AGAAGGAGAA 1380
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TGTTTGTCTT CAAAAGCTGG TACAGGTTGT TATGTATCGA TTTCAGAAGA TAAACGATAT 1500 

TTATTTGAAG CGGTATATGG TGCTGGCATC ATACGTATGT ATGAATTAAA TACGCACACA 1560 

GGTGAAATTA TACX3TCTAAT TCAAGAACTT GCACATGATT TTCCAACAGG TACACATGAA 1620 

AGACAAGATC ATCCACACGC ACATTATATT AATCAAACTC CAGATGGTAA GTACGTTGCA 1680 

GTAACAGATT TAGGTGCTGA TCGTATCGTT ACTTATAAAT TTGATGACAA CGGGTTTGAA 1740

TTTTATAAAG AATCTTTATT TAAAGATAGT GATGGGACAA GACATATTGA ATTTCATGAT 1800 

AATGGAAAAT TTGCTTATGT CGTACACQAA TTATCAAATA CTGTGAGTGT TGCAGAATAT 1860 

AAT6ACGGTA AATTTGAAGA GCTCGAGCGT CATTTAACAA TTCCTGAAAA CTTTGATGGA 1920 

GATACTAAAC TTGcAGCAGT GCGTTTATCT CATGaTCAAC AATTCTTATA TGTATCTAAT 1980 

AGAGGGCATG ATAGCATTGC AATTTTTAAA GTTCTTGATA ATGGTCAACA CTTAGAACTA 2040 

GTAACAaTTA CTGAAaGTGG TGGTCAATTC CCAAGAGATT TTAATATTGC CTCATCAGAT 2100 

GACCyTTTAG TTTgTGCTCA kGaGCaAGGA GATTCAGTTG TAACTGTTTT CGAAAGAAAT 2160 

AAAGAAACAG GTAAAATTAC GCTATGTGAT AACACTCGTG TAGCATCTGA AGGTGTATGT 2220 

GTCATATTTT AATCTTTAAT TAATCATQAT AAAAAGAAAA CCATGTTTCC AAAAAATITG 2280

TGTATACCTT GAAATTTATT GnTTTCCAGn ACATCAATTA TGGGAAGCAT GGnTTATTTT 2340 

TGT 2343 
(2) INFORMATION FOR SEQ ID NO: 158: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4837 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 

AAATTGCCAG TTGGTATCGC TTCTGGTGCA GTAGTCGAAG GTTTCTTCCA AGGTATCATT 60 

CCGATTGGCT ATATCGTTAT GATGGCAGTA TTGTTATACA AAATTACTGT TGAATCTGGA 120 

CAATTTTTAA CAATTCAAGA TAGTATTACA AATATTTCAC AAGACCAACG TATTCAAGTT 180

TTACTTATTG GATTTGCATT CAACGCATTT TTAGAAGGTG CAGCAGGATT TGGTGTACCA 240 

ATTGCAATTT GTGCACTTTT ATTAACACAA TTAGGATTTA ATCCATTAAA AGCTGCGATG 300 

TTATGTTTAG TCGCAAATGC AGCGTCTGGT GCTTTTGGTG CGATTGGTAT CCCTGTAGQT 360

GTTGTAGAAA CGTTGAAATT ACCTGGAGAT GTTTCAGTAT TAGGT6TTTC TCAATCAGCA 420 
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GGTACCGTTA ATGATTGAAG GTCGTAAGTC TGATGAACCA ATTGCTTTAA CTTATGATGA 2340 

AACTGGTACa gATGTTAACT TTGGTGCGTT AACTGCAAAG TTATTTGATA ATTTAGAGCA 2400 

ACGTGGTGTG GQAATTCAAT ATAAGCAGAA TGTATTAGAC ATCAAGAAAC AGAAATCTGG 2460

GGTATGGCTA GTTAAAGTTA AAGATTTAGA AACTAATGAA ACGACAACAT ATGAATCTGA 2520 

TTTTGTATTT ATTGGTGCTG GCGGTGCGAG TTTACCATTA CTCCAAAAGA CTGGGATTAA 2580 

ACAATCAAAA CATATTGGTG GTTTCCCGGT AAGTGGATTA TTCCTGCGCT GTACAAATCA 2640 

AGAAGTGATT GATCGTCATC ATGCTAAAGT GTACGGAAAA GCAGCAGTGG GTGCGCCACC 2700 

AATGTCAGTG CCGCACTTAG ATACACGTTT TGTAGACXXK: AAGOGTTCAT TGTTATTTGG 2760 

TCCATTTGCA GGTTTCTCAC CTAAATTTTT AAAAACAGGT TCACATATGG ATTTAATTAA 2820 

ATCQGTTAAA CCAAATAATA TCGTGAOGAT GTTATCTGCA GGTATCAAAG AAATGAGTCT 2880 

TACGAAGTAT TTAGTGTCAC AATTGATGTT ATCTAATGAT GAGCGTATGG ATGATTTAAG 2940 

AGTCTTTTTC CCAAATGCTA AAAATGAAGA TTGGGAAGTG ATTACAGCAG GGCAACGTGT 3000 

CCAAGTAATC AAGGATACTG AGGATTCTAA AGGTAACTTA CAATTTGGTA CTGAAGTTAT 3060 

TACCTCAGAT GATGGCACAT TAGCTGCATT ACTTGGTGCA TCACCTGGTG CGTCAACAGC 3120

TGTAGATATT ATGTTTGATG TTTTACAGAG ATGCTATCGT GATGAATTCA AAGGATGGGA 3180 

ACCAAAGATT AAAGAAATGG TGCCGTCATT TGGTTATCGC tTAACAGATC ATGAGGATTT 3240 

ATATCATAAA ATTAATGAAG AAGTAACTAA GTATTTACAA GTTAAATAAT AAACGAAACG 3300

GTAATGTCTT TTTTAATGTG ATAGACATTA CCGTTTTTTA GTGGTTAATA AAAATCATTT 3360 

TAATTGTTTC AGTTGCTTGT TAATAGTGTC TACGTAGTTC TTGTTTTTAA AGAATTGAAT 3420 

TATCCAAATT AATACATAAA CCACAATGAA GATAATTGTG AATATGATTA GATAATGCAC 3480 

TGTliGTGGA AACCAACCGG CAAGCATTGC TAAAGGCAAG AATCCGACAT ACGTTGTTAT 3540 

GAAATGCATT ATAGTTGCTT TAGTAATGCT CCAATCTGTG TATTTAAAGA TAAAATCTCC 3600 

AAGGAAAAAG ACGACGCCTA TGAGTAACCA TAAAATGATA GAAATCAACA TTACGGTAGT 3660 

TTCTGTGAAA TGCGTATAAT ACAATATGCC AATAGTTGAT TGTGGGTTCA GTGGATAATA 3720 

TTTGCCGTCT GCAAATAACA TACTAAAGAA CAGTGAAAGG GACAAACCAA TGATTAAGCT 3780

AATAAATAAT GAGTTTTTCA AATTTTTCAT ATTGATAAGC GCTCCTTTAT AGATTTTAAA 3840

TAAOGTCTAG AAQAATAGGT GTAGTGTGCA TCTTTAAGAT ACATACGTAT AA6TCCATTT 3900 

GGCTCTAATA ATAATTTTTC AATOTAATAC TTGTTGACQA TTTCTGATTT GGAAATGCGA 3960

ATGAAATGTT GTGGTAACTG TTTTTCTAGT TCATAAAGTC GTAATTTTAG TTTGAATTTT 4020 
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ACATTAATGA TATGGATTTC TTTGTCTATG TATCCGACTA ATGTATGTGA TTTGTCTAAA 4140 

TCATTGACTG CATTAATAAT ACTTTGAACG TTATCATTCA TTTTAGGTGC ATGTATATCA 4200 

ATATAAGATT CCGTCTCATT TGCATTGATA AATAAATTGA GTTTCATCAT AGGTTAATGC 4260

CTCCTTCAAA ATTATTAAAC CATAAATGAC CATCGATATA TTTAAATTTT GTTGAATGGT 4320 

AGAAATTAAA TGTTAAGTGG CTAGAAAGCG CTAATCAATA TAAAAGATAC CTCCTGAAAT 4380 

AAAAACAGAA ATGTTTTTTC AGGAGGTAGA GATTAAAGTG AATTATTTGG CAGTGTAATA 4440 

GTAAAGGTGG TTACATACTC GTTACTTTGT GTGAATTGGA TTGTACCATG ATGCAATTCA 4500 

ATGATGGATT TTGTAATTGC AAGACCTAAA CCATTGCTAT TATCATGTTT GCTCACTTTA 4560 

TAAAAACGTT CAAATAAACG TGCTTCAGCT TGTGGACTAA TTGGTGAACC ATCATTACTT 4620 

ATTGTGAAAA TGATATTGTT GTGACTATGT TGCAAAGCGA TGTCAATGGC ACCACCAACA 4680 

TCTOTATACT TAATAGCATT TATTAATAAA TTACTCAATG CTTGATGTAA CAAACGTTGA 4740

TTTCCTAGGA AATTGATGAT TCTAGGTCAG CTAAnATGAT TAACGACTTT TCATCAGCAG 4800 

CAnATTGTTC ATGTCGAATG ATATCnTTAA TGAGCTG 4837 

(2) INFORMATION FOR SEQ ID NO: 159:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1600 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:

ACAATTATTG GATTATTATC AAGCAACGTT AATGGATGAC TTCCACTTAC AACAGAAATG 60 

CCCATAGATT CTAAATCTtT TGCATGAGCA TCTTGTGATA A6TCTTTTCC ATCATTGACA 120 

GTTACATTCG CACCTAATTT ACTTAATAAT TTAGCTGCTT CATAACCACT TTTTGCCAAA 180 

CCGACAACTA ATACATTTTT ATTTTCTAAC CCTGTATAAT TAAGCATCTT AATGCACTCC 240 

AATCCATAAA CCGATTAAAC CTGAAATCAG ACCAACAGCC CAAAATACTG TAACTACTTT 300 

CCATTCGCTC CATCCTATCA ATTCAAAATG ATGATGAATC GGACTCATTT TAAATATACG 360

CTTTCCAGTC AATTTAAAGC TAGCGACTTG TAACATAACA GATAATGTTT CAATTACGAA 420 

TACTAAACCT ATAAAAATTA ATGATAATTC CTGATTAAGC ATGATTGAAA TGGTA6CAAA 480 

TATACCACCT AAAGCTAAGC TACCTGTATC TCCCATAAAC ACrTTAGCAG GGTTAATGTT 540

ATATGGTAAA AATCCTAAAA GTGCAAACAA CATAATGATA CAGAAAATAC CAATTGCCGT 600 
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TGTTGCTTTT CCTATTAACA TTATTACCAT TTTTCAATAT 
TAAGCAGAGT CAAATTACTA 540

ATATGTTAAG CAATGCACCC GCTGAAACAT CTACTCTAAT TAAGAGTGTA ATTGGTGATA 600

TAACTCAAAA CTCCAGTGGT GGCTTATTAT CTATCGGTTT GATTTTAGCA ATTTGGTCAG 660

CTTCAAATGG AATGACTGCA ATTATGAATT CTTTCAATGT TGCTTACGAT GTAGAAGATA 720

GCCGTAATGG AATCGTATTA AAACTACTAA GTGTTGTCTT CACTGTAGTT ATGGGCGTTG 780

TGTTTGTAGT TGCTCTAGCA TTACCAACGC TTGGTTCTGT AATTAGTCAT TTCCTATTCG 840

GTCCACTTGG aTTTGACGAA CAAGTGAAAT GGATTTTTAA CCTTATTAGA ATTGTGTTAC 900

CAATCATTAT TATATTTATC ATATTTATCG TGTTATATTC GGTTGCACCT AACGTTAAAA 960

CGAAGCTTAA GTCAGTATTA CCAGGTGCAG TATTTACTTC AATTATTTGG TTAGCTGGTT 1020

CATTTGGTTT TGGTTGGTAT ATTTCAAATT TTGGTAACTA TTCTAAAACA TATGGCAGTA 1080

TCG06GGTAT CATCATTTTG TTACTATGGT TATATATCAC AA6TTTTATT ATAATTGTCG 1140

GnGCTGAAAT CAATGCAATC ATTCATCAGC GTAGTGTAAT TAAAGG 1186

(2) INFORMATION FOR SEQ ID NO: 161:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7872 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161:

TCTTGAGCCA TCTTTTGAGC TAACTGACTA GATTGATACC CAAAAATCAT AGTTACCAAC 60

ATAAACTTTA ATTTTACCGA AGTCTAAATC AGCGATATGA GTACATACAT TATTTAAGAA 120

ATGACGGTCA TGCGATACTA CGATAACAGT ATTATCAAAG TTAATTAAGA AATCTTCTAA 180

CCAACTGATT GCTGGAATAT CGAGACCGTT AGTAGGCTCA TCCAGTAATA GTACGTCTGG 240

TTCACCGAAT AAACTTTGCG CTAATAATAC TTTAATTTTT TGGTTGTTTT CTAATTCAGC 300

CATTTTTTTA TCGTGTAAAG TTGGATCGAT ACCTAAACCA GATAAAAGGT TAGCAGCATC 360

AGCTTCAGCA TTCCAACCAT TCATTTCTGC AAATTCACCT TCAAGTTCAG CAGCACGGAT 420

ACCATCTTCA TCACTGAAAT CTGGCTTCAT ATAGATTTCA TCTTTTTCTT TCATAACCTC 480

ATAAAGACGT TCGTGACCTT TAATTACAAC ATCAAGCACG CGTTCATCTT CATAAGCATA 540

GTGGTCCTGT TTTAAAACAG CTAGACGTTC ATTTTTCCCT AATGAAACAT GTCCTGTTTG 600

AGAATCTAAT TCACCAGATA ATATTTTTAA GAATGTTGAT TTACCTGCAC CATTCGCACC 660
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ATCTCCAAAA CGTAAACTCA CATCAGTTAC TTGTAACATG CATTTTCTCC TTTTTTTCAT 780 

TCGATATTCT AACGGAAGAA TTATATCATA TTATCGTCAC AGTTTCGACC TCATATAAGT 840 

TGTAATGATA GAATGACTCA CACATGTTAT AATAATAAAG AATACAAGAA TCGAAGGAGA 900 

ATAACATGGC ATTAGACAAA GATATAGTAG GTTCTATAGA ATTCCTTGAA GTAGTAGGGT 960 

TACAAGGTTC AACTTACCTT TTAAAAGGAC CAAACGGTGA AAACGTAAAG TTAAACCAAT 1020 

CAGAAATGAA CGATGATGAT GAATTAGAAG TAGGTGAAGA ATATAGTTTC TTCATTTATC 1080 

CAAACCGTTC AGGTGAAITA TTTGCAACTC AAAATATGCC TGATATTACG AAAGATAAAT 1140 

ATGACTTTGC TAAAGTACTT AAAACGGATC GCGATGGGGC ACGTATAGAT GTTGGATTAC 1200 

CCCGTGAAGT GTTAGTACCA TGGGAAGATT TACCAAAAGT GAAATCACTA TGGCCACAAC 1260 

CTGGTGATTA TTTGCTAGTT ACATTACGAA TTGACCGTGA GAATCATATG TATGGACGTT 1320 

TAGCX3AGTGA ATCTGTTGTA GAAAATATGT TTACACCTGT ACACGACGAT AATTTAAAAA 1380 

ACGAaGTCAT TGAAGCCAAA CCTTACCGCG TATTACGAAT TGGTAGCTTT TTATTAAGCG 1440 

AATCAGGTTA CAAAATTTTC GTACATGAAT CAGAACGTAA AGCTGAACCA AGATTAGGTG 1500 

AATCTGTTCA AGTTAGAATT ATCGGGCATA ATGATAAAGG TGAGTTAAAT GGTTGATTTT 1560

TACCACTTGC ACATGAACGT TTAGACGATG ACGGCCAAGT CATCTTTGAT TTACTAGTTG 1620 

AATATGATGG TGAATTACCA TTCTGGGACA AATCAAGCCC TGAAGCGATT AAAGAAGTAT 1680 

TCAATATGAG TAAAGGTTCA TTCAAACGTG CAATCGGTCA CTTATATAAA CAGAAGATTA 1740 

TTAATATAGA AACAGGTAAA ATCGCTTTAA CTAAAAAAGG TTGGAGTCGA ATGGACTCAA 1800 

AAGAATAATC ATTTTTACAC GTGTCGTAGG ATGCGTGTTT TTTTTATTCA ATATTAAATC 1860 

GGACAGATGA AGTAGTTTTT TAAACATTCC TTTCAAAGTA AAAAATTAAA TAATTCAAAC 1920 

GAATAGGCTG GGaCATTAAG TTCTTAGGCA ATGTAAAAAA GCTGATTTCT ATTAATTATT 1980 

TGATGGAAAT O^GCTTTTTT GATATGTATT TTATAATGTA CAGCTCGTTG AGCTGCTATT 2040 

TTCCTTATAT TAAGTGCCAT TAATACAAAA CCTAGCTCTC GTTTAACTTT ATTTATTCCT 2100 

CGAACTGACA TTCGAGTGAA aCCCAAAATA GCCTTCATAA ATCCAAAAAC AGGCTCTACA 2160 

TAAATTTTTC TATGACTATA GATTTTTTTC GTTTCTGGTT CAGAAAGCTT TTGaTTAATT 2220 

TGGGCTTTAA TGTATTTCAA AGTAAAATTA CATGTTAATA CGTAGTATTA ATGGCGAGAC 2280 

TCCTGAGGGA GCAGTGCCAG TCGAAGACAG GGGCCCCAAC ACAGAAGcTG ACATATAGTC 2340 

AGCTTACAAC AATGTGCCGG TTGGGGTGGC TGAGACGGCA CCCTAG6AAG GQACCCGTCA 2400

TCAAAAATTC TATTTATAGA ATTTTACAGT AATGTGACAG ACGGGCAAAG CGAAgCCATT 2460 
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CTTACTGCTG TTTTTTTAGG GATTTATGTC CCAGCCATTT TTGTATTCAT ATTTAAATTT 2580 

CGATAATTTT TCAGGAAGCA TTTTAATTTT ACTAATGAAG CAATATTTTT TAGATTAACA 2640 

AAAATTAATA TTTACATTTT CTTAACAATT TTTTATGTAA CATTTACAGT TTCTAAAAAT 2700

GAGGTTAATA ATTCAAGGTT AAGATAAAGA TGTAATCAAT ACAAATACTA TTTGTTGTTC 2760 

ATACAGGGAG GATATTTCAA TGAAAAAATG GCAATTTGTT GGTACTACAG CTTTAGGTGC 2820 

AACACTATTA TTAGGTGCTT GTGGTGGCXX3 TAATGGTGGC AGTGGTAATA GTGATTTAAA 2880 

AGGGGAAGCT AAAGGTGATG GCTCATCAAC AGTAGCACCA ATTGTGGAGA AATTAAATGA 2940 

AAAATGGGCT CAAGATCACT CGGATGCTAA AATCTCAGCA GGACAAGCTG GTACAGGTGC 3000 

TGGTTTCCAA AAATTCATTG CAGGAGATAT CGACTTCGCT GATGCTTCTA QACCAATTAA 3060 

AGATGAAGAG AAGCAAAAAT TACAAGATAA GAATATCAAA TACAAAGAAT TCAAAATTGC 3120 

GCAAGATGGT GTAACGGTTG CTGTAAATAA AGAAAATGAT TTTGTAGATG AATTAGACAA 3180

ACAGCAATTA AAAGCAATTT ATTCTGGAAA AGCTAAAACA TGGAAA6ATG TTAATAGTAA 3240 

ATGGCCAGAT AAAAAAATAA ATGCTGTATC ACCAAACTCA AGTCATGGTA CTTATGACTT 3300 

CTTTGAAAAT GAAGTAATGA ATAAAGAAGA TATTAAAGCA GAAAAAAATG CTGATACAAA 3360

TGCTATCX5TT TCTTCTGTAA CGAAAAACAA AGAGGGAATC GGATACTTTG GATATAACTT 3420 

CTACGTACAA AATAAAGATA AATTAAAAGA AGTTAAAATC AAAGATGAAA ATGGTAAAGC 3480 

AACAGAGCCT ACGAAAAAAA CAATTcAAGA TAACTCTTAT GCATTAAGTA GACCATTATT 3540 

CATTTATGTA AATGAAAAAG CATTGAAAGA TAATAAAGTA ATGTCAGAAT TTATCAAATT 3600 

CGTCTTAGAA GATAAAGGTA AAGCAGCTGA AGAAGCTGGA TATGTAGCAG CACCAGAGAA 3660 

AACATACAAA TCACAATTAG ATGATTTAAA AGCATTTATT GATAAAAATC AAAAATCAGA 3720 

CGACAAGAAA TCTGATGATA AAAAGTCTGA AGACAAAAAA TAATAAGACG CAATTTCAAA 3780 

TGTGTCTTGA AACATGATTT TGATGGTGAA TCATTATTTA GAGTACAAAG CTTGATTTAT 3840 

CGAGACGCTG ATTTTGACAT TCAGTTAGTC TAcAAGCTTA TCAACTTAAA ATAGTGGTTC 3900 

ATCATTATTT TACAAATCTA ATTATTTTGG GAGTAATAGA AAGAGGTTTG ATTATGACTT 3960 

CATCTACTAA TGTTAAAGCT TTAATCGAAA AAAATAATAA TAAAAAAGGA AAGCATAATG 4020

ACAAAATTAT ACCAGTTATT TTAGCCGCAA TTTCAGCGAT TTCCATTTTA ACAACACTAG 4080 

GTATATTAAT CACATTGCTT TTAGAAACCA TCACTTTTTT CACCA6AATT CCAATAACTG 4140 

AATTTCTATT TTCTACTACT TGGAATCCTA CCGGTTCAGA CCCTAAGTTT GGTATCTGGG 4200

CATTGATAAT AGGGACTTTA AAAATCACAG TTATTGCGAC TATATTTGCA GTTCCAGTCG 4260 
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AACCGATATT AGAAATTTTA GCAGGAATAC CAACAATTGT GTTTGGTTTC TTTGCATTAA 4380 

CCTTTGTTAC ACCAGTATTA AGATCTTTCA TACCAGGTCT TGGAGAGTTT AATGCTATAA 4440 

GTCCCXMCTT AGTTGTCGGT ATTATGATTG TCCCTCTCAT CACAAGTTTG AGTGAGGaTG 4500

CAATGGCATC TGTACCAAAT AAAATTCGAG AAGGTGCCTA TGGACTTGGA GCAACTAAAT 4560 

TAGAAGTAGC AACTAAAGTC GTACTTCCCG CAGCAACATC AGGTATTGTA GCTTCAATCG 4620 

TTCTCGCGAT TTCAAGAGCA ATTGGAGAAA CGATGATTGT ATCATTAGCG GCAGGTAGTT 4680

CGCCAACAGC TTCATTAAGT TTAACAAGTT CGATTCAAAC AATGACTGGA TATATTGTTG 4740 

AGATAGCGAC AGGTGATGCA ACATTTGGAT CAAATATTTA TTACAGTATT TATGCTGTAG 4800 

GGTTCACACT ATTTATCTTT ACCTTAATCA TGAATTTACT TTCTCAGTGG ATTTCTAAGC 4860 

GTTTTAGGGA GGAGTATTAA TATGGAAACG ACAGATAATA ATAGACAATC ACTCGTCGAT 4920 

CAACAACTTG TCCAAAAACA TTTATCATCC AGAACGGTTA AAAATAAAGT GTTCAAACTC 4980

ATATTTTTAG CATGTACATT ATTAGGACTT GTCGTACTTA TTGCGTTGTT AACTCAAACA 5040 

TTGATTAAAG GGGTAAGTCA TTTAAATTTA CAGTTTTTCA CTAATTTTTC TTCTTCAACA 5100 

CCATCTATGG cTGGCGTTAA AGGCGCGTTA ATCGGTTCAC TTTGGTTAAT GTTAAGTATC 5160

ATTCCATTAT CAATCATCCT AGGAATAGGT ACAGCTATAT ACTTAGAAGA ATATGCGAAA 5220 

AACAACAAAT TTACTCAGTT TGTTAAAATC AGTATTTCCA ATTTAGCTGG TGTACCATCA 5280 

GTTGTATTTG GGTTATTAGG TTATACTTTG TTCGTTGGTG GTGCAGGGAT TGAAGCCTTG 5340

AAAATGGGTA ACAGTATATT GGCAGCAGCG CTAACAATGA CCTTACTGAT ATTACCAATT 5400

ATTATTGTTT CAAGTCAGGA AGCAATTAGA GCTGTACCTA ACTCAGTACG CGAACTTcTT 5460

ACGGCTTAGG TGCTAATAAA TGGCAAACGA TAAGACGTGT TGTCTTACCA GCAGCGTTAC 5520 

CTG6TATTTT AACTGGATTC ATTTTGTCTC TTTCAAGAGC ACTGGGAGAA ACAGCGCCAC 5580 

TTGTGCTAAT CGGTATACCG ACTATATTAT TGGCAACACC TAGAAGTATA TTGGATCAAT 5640 

TTTCAGCATT ACCTATCCAA ATATTTACTT GGGCGAAAAT GCCTCAAGAA GAATTCCAGA 5700 

ATGTTGCATC GGCAGGCATT ATCX3TTTTAC TAGTTATCTT AATCTTAATG AATGGCX3TTG 5760 

CGATTATTTT ACGTAACAAA TTTAGTAAAA AATTCTAATT TAAACAATCA ATCTCATTTA 5820

TCTATTAAAA AGGGAGTTTT AAATATGGCG CAAACACTTG CACAAACTAA ACAAATATCT 5880 

CAAAGTCATA CGTTTGATGT CTCACAAAGT CATCATAAAA CACCAGATGA TACAAACTCA 5940 

CATTCTGTTA TATATTCAAC ACAAAATTTA GACTTATGGT ATOGCGAAAA TCATGCATTA 6000

CAAAATATTA ATTTAGATAT TTATGAAAAC CAAATTACTG CCATTATAGG TCCATCTGGT 6060 
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AAAACAGCTG GTAAAATATT ATATCGAGAT CAAGACATTT TTGATCAAAA ATATTCTAAA 6180 

GAACAATTAC GTACAAATGT GGGCATGGTC TTTCAACAAC CTAATCCATT TCCAAAATCA 6240 

ATATACGATA ATATTACTTA CGGTCCAAAG ATTCACGGTA TTAAAAATAA AAAAGTTCTT 6300

GATGAAATCG TTGAGAAATC ATTACGTGGC GCTGCAATTT GGGATGAATT AAAGGATAGG 6360 

TtGCACACAA ATGCATATAG TTTATCCGGT GGGCAACAAC AACGTGTTTG TATCGCGCGT 6420 

TGTTTAGCAA TTGAACCTGA AGTCATTTTA ATGGATGAAC CGACATCAGC ATTAGATCCA 6480

ATCTCAACAT TAAGAGTAGA AGAGTTGGTT CAAGAACTAA AAGAAAAGTA TACAATTATT 6540 

ATGGTtACAC ATAATATGCA ACAAGCAGCT CX5TGTATCAG ATAAAACTGC ATTTTTCTTA 6600 

AATGGTTATG TCAATGAATA TGATGATACT GATAAAATTT TCTCTAACCC ATCAAACAAG 6660 

AAAACAGAAG ATTATATTTC AGGAAGGTTT GGTTGATATA TAATGGCAAT AATTAGACAA 6720 

CGATATCAGG AGCAACTTGA TGATTTAATA AAAGAATTAC GTCGGTTAGG TGCaAATGTC 6780 

TATGTGAGTA TTGaAAATGG TATAAAAtCA TTAAGTATTG aCGATAGAGG cTTTGCACGA 6840 

CAAACAGTTA AAAACGATAA ACATATCAAT CAATTAAATT ATGATATTAA TGAGCGAGTT 6900 

ATCATGTTAA TTACAAAGCA ACA6CCCATT GCGAGTGATT T6CGTATGAT 6ATTTCTTCA 6960

TTAAAAATCG CCTCCGATTT AGAAAGAATA GGAGATAATG CCTCGAGTAT TGCCAATATT 7020 

CGATTGCX5TA CAAAGATTAC AGATGATTAT GTGTTAACCC GTTTAAAGAC AATGGGTAAA 7080 

TTAGCTATGT TAATGTTAAA GGACTTAGAT CAAGCATTTA AAAAGAAAGA TACCGTATTA 7140

ATAAGAGAAA TAATTGAGCG TGATGAAGAT ATCGATGACT TATATAGTCA TATTATTAAC 7200 

GCAACGTATC TTATTGATAA CGtCCATTTG TCGCTGCACA AGCTCATTTA GCAGCAAGAC 7260 

ATTTAGAACG TATTGGTGAT CATATTATTA ACATCGCTGA AAGTGTTTAT TTTTATTTAA 7320 

CAGOTACACA TTACGAACAA TAACTTAAAG TTATTACTAT AAAATCCCTT ACX3ATAAATA 7380 

TATATTTCTA TTATTGATAA ACCCTCAAAA AAACCAAGAT TCTCACAATT AGTAATGTGA 7440 

AAATCTTGGT TTATATTGTT CTACTATAAA TTGTCTCGCA TCTTAGTTAT TTGCTTGCTC 7500 

AATTTCATCT GTTAATTTTT CAACTTCATC GACTAAATCA GAAATATATT GAATTGTAGA 7560 

TTTAAGTGGC TGTTCTGTAG TAATGTCTAC ACCTGCAATG TTTGCAAGTT CGACAGGTGA 7620 

TACACTACCA CCTTTTTTCA ATGTTTCTAA CCAAGCATCA ACAGCTGGTT GGCCTTCATT 7680 

TTTAATCTTT TGAGAAACGA CAGTTCCGAT TGTTAAGCCA GCAGAATACG TATACGAATA 7740 

TAATCCCATA TAGTAATGAG GTTGACGCAT CCATGTTAAT TCAGCACCCT CAGTCATGTC 7800

TACTGCATCT CCAAAAAATT GTTTATAAAC ATTTAGCATT ATTTCATTTA ATGTnCGGCG 7860 
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(2) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 798 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:

TTTTTTCTTT TCTTCATTTG AAAATTGATC ATTCAGCAAT ATAAGCGTAT TT6TTAATGA 60

TTTAGGTGTT CCAATTTCAT AATCCCACCA ATTTAAGTTG GTATTCTTGC CAGTTGTTTT 120

AGTAAAATTC TCACTTAATT CTTTTACTTT TTTATCTG6T TCTTTTCCAT ATGCATTTTT 180

ATGCAGCCAC TCAAGGGCAT ^ X X A XXX CTTCTTATTT TCGTCAGTAT TTAAAGTGGT 240

TTTAGGATTC CTCATCGCTT CTGCGATTTT CTCAATATTA CGATAGGTAC GAGTCATATG 300

AGAAGAATTA GTTTCAAGGG TTTCCGCTCC TGACCACAAG TATTTCCTAC CACTTTCAGT 360

TTTCATTTCC TTGAGTAAAT TCGTCGCCTC TTTCTCTGTA GCATCAAACT TCTTCTTCAT 420

ATCTGGATTA TTCTCATCAT ACTTATCATA ACCATAGTTA AOGTCCAGCC ATGTGTTCCT 480

CAATTTTTCA TAATCTGGCG TTTGAACATT CGTATCAGCC ACAGCGATTT GATGTTTATC 540

AACACTTCTG AATTCACCAC CATTCAAAGT AATCACACCA GCCATTAATA ACGTAATGGT 600

GGATAATTTT TGCCATTTCT TTATTCTATA TGTCATTGaC ATGTCTCCTT TTTGTGTTGC 660

GCGTGCGCAA TGAATATTAT GATTAAATAA TGATTCAATT TTTCAAAATT CGTTAACGTA 720

TACAAATGAC TGTCTACTGT CAAACAATCC ACAAAGAATG TTGATGtCAT ATaAACAATC 780

GATCACCCAA ATTTTCCG 798



(2) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5132 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163: 
TACAGGTTTT ACTATAATGG ATGGTATTTT GGCTAAACGA CATTGGTTTA GTCTTCTTTT 60 
TTTnACTTCC TAnATTTACA ATGGTATAAA TAATAATQCT ATATTTAGAA TGATGAGTAT 120 
ACTTACTGAA ACTAAATTAA AAGTGTCTGG TTCTTTACTA AAGATAGCTG CTATCCTTGC 180 
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AATACAAGTT CCAATGAGCX5 CAATTAAAAG TACTAACCCA ACGATGAAAC TCTGTTTGTC 300 

ACTTAACTCA AAGAAACTAT AGATAGGATA TTTTTTAATA ATCAAGCCAC CTAAAATCAT 360 

CCATAAAAAT ACGATAATTC CATAAGTCAC ATTTATAACA TACGTTATTT TTTGGTCACC 420 

AAATCGGACT AATGTATTTC GTAGAATCAG CATACCAATG ACAACACCTA AAATAACGAT 480 

ACTAGCTATA TAAAGTAAAA ATGCAATTGT CACATCAAAT GTACCCAAAT CTAAAAACCT 540 

AGGAATTAyA AyGACTGCTA AAATAATJ^GC GAAGyACAAA GTAATATAkT TATACAAACC 600 

GGTAGTAAGA CTTATCTCAG GTGATAATTG ATCAGCCATT GACTTAATCG GTGTATTAAT 660 

AATTGAACTT GTATCTTCGT TATTTTTTTC AGCCATAGTT AAATGATCTT CGAGCTCTTC 720 

CAATAACTCT TCTACTTCTG CTTCAGTCTT ACCTCTAAAT AACAATTCAA CACGTAATTT 780 

TTCTAAAAAA TCTTGAGATT GTTTACTTAA CATCGTTTTC CCCTCCAAAC AAGTTAATCA 840 

TCCCTTTATT CAAAACTTGC CATTTCGATT TAAATACTTT TAGTTCCTTT AAACCTGAAT 900 

CGGTAATCGT ATAGTATTTC CGCCTCGGGC CGCCATTACT AGATTTTTTT ATTGTCGTAT 960 

CAAC6TATCC TTTTTTGTTT AAACGCATTA AAACTGGATA AATACTACCC TCACTTATCT 1020 

CTGGAAACTC TTGATTCTTA AGTTTCGTCA TAATTTCATA TCCATACGTT TCGCCTTGGG 1080

CAATGAGACC TAATATCGCC CCATCTAAGA GACCTTTCAT AATCTGATCT GACACTGACA 1140 

TTTTAATCAC CTACTATCTT ACATAATAAG ATAGTACATT GAGAACTTTT CGTCAACTAT 1200 

CTTTTATTGT AAGGTAGTT6 TTGTACACAT TCCTTAAATG ACTAACAACT TTGTTAATAG 1260

GGTAATACTT ACGGAAGTAT ATTTTATTTA TGGGGGAGGA ATTAATAATG ACTACAAAAA 1320 

CAGTATTTGA TGTCATTGAT ATGGGGTTAG GATATTTAGT AAATGTGTAT GATGCTTGGA 1380 

AAGTTGAAAA GGTACTTGAT GATTATCATA AGCCTTTTTC TAATACCATT CATTGGCAAT 1440 

TTGGtCATGT ATTAACAATT TTTGAATCGG CCTTAOCTGT TGCTGGTAAA GAGAATATTG 1500 

ATTTAAATAT CTATAGACCT TTATTCGGAA ATGGTTCGTC TCCAGATGAA TGGAAGGATG 1560 

AAGTACCGAG TATTGAAAGG ATTTTAGAAG GTCTCCAAAC TTTACCTGAA CGTGCACGAA 1620 

ATCTAACTGA AGATGATTTA GCAATTGAAT TGAAACAGCC AATTGTCGGT TGTAATAACT 1680 

TAGAAGAGTT ATTAGTATTA AATGCCATTC ACATCCCACT TCATGCTGGT AAAATTGAAG 1740 

AGATGTCTCG TATATTAAAA AATTTAAAAT AAATATGTGC TTATTAACCG TTAACAACAC 1800 

GTTAACGGgT TTTTTATTTG TTTAAAAGGT CACTTTTTTG AATTTAATAA ACACCATCTA 1860 

TACCAGTTCT TCACCX5ATTC TCXykAAAATA ATTATATTAA TGATTTCGTT AATTTAATTT 1920

TATATTTAAT TATTACTGTA CATCTTTTGT AGTTAGCTTT ATTCTTAAAT TGAAATATGT 1980 
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TACTCCCTAT CGTTGTAGGT CTCCTTATTT GGGCACTTAC ACCTTTTAAA CCGGATGCTG 2100 

TGGATCCAAC AGCATGGTAT ATGTTCGCAA TATTCGTCGC GACAATCATT GCTTGTATTA 2160 

CACAACCGAT GCCAATTGGG GCCGTCTCTA TAATTGGATT TACAATCATG GTACTCGTTG 2220

GCATTGTTGA CATGAAAACG GCTGTCGCTG GTTTTGGTAA TAATAGCATT TGGTTAATTG 2280 

CTATGGCATT TTTCATTTCG AGAGGATTTG TGAAAACAGG TCTTGGTAGA CGTATCGCAC 2340 

TTCATTTCGT CAAATTATTT GGTAAAAAAA CATTAGGATT AGCATATTCT ATCGTCGGTG 2400 

TAGATTTAAT TCTAGCGCCT GCTACACCAA GTAATACCGC GCGTGCTGGT GGAATCATGT 2460 

TCCCAATTAT CAAATCACTT TCTGAATCAT TTGGTTC6AA ACCGAAAGAC GGATCAGCAC 2520 

GCAAAATGGG TGCATTTCTT GTTTTCACAG AATTCCAAGG TAATTTAATT ACTGCGGCTA 2580

TGTTTTTAAC TGCAATGGCC GGTAACCCCC TTGCACAAAA TTTAGCATCT AGCACATCTA 2640 

ATGTTCACAT TACATGGATG AATTGGTTTC TAGCTGCTTT AGTTCCTGGA CTTGTTTCCT 2700

TAATTGTTGT ACCTTTTATT ATTTATAAAA TTTATCCACC AACTGTTAAA GAAACACCAA 2760 

ATGCTAAGAG TTGGGCTGAA AATGAATTAG CGACTATGGG TAAAATCGCT TTAGCTGAAA 2820 

AATTTATGAT TGGTATTTTT GTCGTTGCGT TAACACTATG GATTGTCGGA AGTTTCATTC 2880

ATATTGATGC AACTTTAACG GCCTTTATTG CQCTAgcATT gTTATTATTG ACAGGCGTCT 2940 

TAACATGGCA AGACATTTTA AACGAAACAG GTGCTTGGAA CACATTAGTA TGGTTCTCAG 3000 

TATTAGTGTT AATGGCCGAC CT^TTAAACA AGCTTGGATT TATTCCTTGG TTAAGTAAAT 3060 

CCATTGCTAC AAGTCTTGGT GGCTTAAGCT GGCCTATAGT CCTGGTCATT TTAATATTGT 3120 

TCTACTTCTA TTCACATTAC TTATTTGCAA GTTCTACAGC ACATATCAGT GCGATGTATG 3180 

CAGCATTACT AGgCGTTGCC ATCGCAGCCG GTGCACCACC ATTATTCAGT GCATTAATGT 3240 

TAGGTTTCTT CGGTAACCTA TTAGCTTCAA CAACACACTA TAGTAGTGGT CCAGCGCCGA 3300 

TTCTATTCTC TTCAGGTTAC GTGACTCAAA AACGTTGGTG GACAATGAAC TTAATATTAG 3360 

GTTTCGTCTA CTTTATTATC TGGATT6GTT TAGGATCACT TTGGATGAAA GTAATTGGTA 3420 

TATTTTAAAA TATTTAAATT AGCGCTCGAA TCTCATTGAT TTGGGCGCTT TTTAATTTGT 3480 

ATTTAAAATC AACCTTTGCT AAATCAAGAC TCCCTTTTTA AAATACGTTT ATCCTTTAAA 3540

TCATTGCGTG CTTCACTGAA AATTTGTATA AAGATTTAAG TCATTACGTA ACATCACATA 3600 

AAATACATTT CTATACTATT CCGCTTCATT GATTAACATT ACGTATGCCC TCATAAATCA 3660 

TCATACAAAA AACACCTTCG TTTAAATTCA TTTTAATTGC GAATTCAACG AAAGTGCCTT 3720

ATTTCATATT TAATGTTTCA AATTTATACG TCTGTCACTG TTACT6CACA CATACCTCAG 3780 
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TTATAGGGTT TTTGCGACCG GATGTTTCTT CAATTTAATG TATTGAGAAA GACTATATAA 3900

CACAATACCT GTCCAAATAA ATATAAACGT AATTAATTGA TCTATACTAA AAGGCTCTTT 3960

GAAAACAAAT ATGCCGAGTA CAAACATTAT TGTTGGTCCA ACGTATTGAA TAAATCCTAT 4020

TAGCGAAAGT GGAATACGTT TTGCCCCGGC TGAGAATAGG ATTAGTGGTA TTGCCGTAAT 4080

AGCACCAGAA AATAACAACC AAAATGATGA CATGTTCAAT CCAAATGACA TCTGATGTTG 4140

CTGCCATAAA TAAATAACGT ATATTAGTCC AGCAGGTGCG GTAACAATAC ATTCAATCGT 4200

AATACTGCTG ATGGCATCAA TATGTACTAC TTTTTTCAAT AATCCGTATG TACCAAAGGA 4260

TAACGCTAAT ATAATAGAGA CGATTGGGAA TTCTCCAATC TTGAGCGTCA TATATAATAC 4320

ACCGATGAAT 6CGAATAAAA TGGCTAGCCA TTCAAATTTA TTGAATCTTT CTTTTAAAAA 4380

GATAAGTGCG AGCAAAATGC TAACAAGTGG ATTTATATAA TAACCTAAAC TTGTTTGTAG 4440

GACGTOACCG TTCGTTACAG CCCAAATAAA T6TACCCCAA TTTAATGTAA TGACATAGCC 4500

T6CTACGACA ATCGCTAATA GCTGAATGGG CTTGCCTAAC AATTGATTCA TATCTOGTTG 4560

AAATGCATTG CGTTGTTTTT GTCCAACCGC GAGTATGAAA ATCATGAATA TTGCTGAAAA 4620

TATAATACGA AAGGCTAAAA TTTCAAATGC GCCTATTGCA TCAAC6AACT GCCAATATAT 4680

AGGTAGTATT CCCCACAGAA TGTATGCACT GAGTGCTAAA AATATGCCTT TTTTATACTC 4740

TGAATTCACC TTCAAACCTC CTTACTTTCC TAATTTTTAA TTTACTGCAT ACGCTCACTT 4800

GGTTATGCTA ATATAACGAT TTTACTAATA ATATTTCGAT AAAGATATCA TTTTGTTTAT 4860

ATTTCCCACA TTTATTCACC AACCACTAAA CAATATTAAT TTTATAAATA ATTCTGTACA 4920

AATCAGGGTA TATTGCCAGA AAGACTACCA TACAACATAA AGGATGGATA CAAATGACTT 4980

TACCTAAAAT TGGAAAGCCT GCAACACGCG CGCTAAATTC ACAAGGTATA TACACATTAG 5040

AAGGAGTATC ACAATATACG AAGTCATCTC TAATGGAGAT GCATGGCGTT GGTCCTAAAG 5100

CTATATCAAT ATTGGAACAA GCTTTATTTC AG 5132



(2) INFORMATION FOR SEQ ID NO: 164: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22243 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:

AAGTAAATTA TATTATGAAT TTGCCTGTCA ATTTCTTAAA GACATTCTTA CCGGAACTAA 60 
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TAGAAGCAAT TAATAATGCy mAAGAAAAGA CAGCTAATAA TACCGGCTTA AAATTAATAT 180 

TTGCAATTAA TTATGGTGGC AGAGCAGAAC TTGTTCATAG TATTAAAAAT ATGTTTGACG 240 

AGCTTCATCA ACAAGGTTTA AATAGTGATA TCATAGATGA AACATATATA AACAATCATT 300 

TAATGACAAA AGACTATCCT GATCCAGAGT TGTTAATTCG TACTTCAGGA GAACAAAGAA 360 

TAAGTAATTT CTTGATTTGG CAAGTTTCGT ATAGTGAATT TATCTTTAAT CAAAAATTAT 420 

GGCCTGACTT TGACGAAGAT GAATTAATTA AATGTATAAA AATTTATCAG TCACGTCAAA 480 

GACGCTTTGG CGGATTGAGT GAGGAGTAGT ATAGTATGAA AGTTAGAACG CTGACAGCTA 540 

TTATTGCCTT AATCGTATTC TTGCCTATCT TGTTAAAAGG CGGCCTTGTG TTAATGATAT 600 

TTGCTAATAT ATTAGCATTG ATTGCATTAA AAGAATTGTT GAATATGAAT ATGATTAAAT 660 

TTGTTTCAGT TCCTGGTTTA ATTAGTGCAG TTGGTCTTAT CATCATTATG TTGCCACAAC 720 

ATGCAGGGCC ATGGGTACAA GTAATTCAAT TAAAAAGTTT AATTGCAATG AGCTTTATTG 780

TATTAAGTTA TACTGTCTTA TCTAAAAACA GATTTAGTTT TATGGATGCT GCATTTTGCT 840 

TAATGTCTGT GGCTTATGTA GGCATTGGTT TTATGTTCTT TTATGAAACG AGATCAGAAG 900 

GATTACATTA CATATTATAT GCCTTTTTAA TTGTTTGGCT TACA6ATACA GGGGCTTACT 960

TGTTTGGTAA AATGATGGGT AAACATAAGC TTTGGCCAGT AATAAGTCCG AATAAAACAA 1020 

TCGAAGGATT CATAGGTGGC TTGTTCTGTA GTTTGATAGT ACCACTTGCA ATGTTATATT 1080 

TTGTAGATTT CAATATGAAT GTATGGATAT TACTTGGAGT GACATTGATT TTAAGTTTAT 1140 

TTGGTCAATT AGGTGATTTA GTGGAATCAG GATTTAAGCG TCATTTCGGC GTTAAAGACT 1200 

CAGGTCX;AAT ACTACCTGGA CACGGTGGTA TTTTAGACCG ATTTGACAGC TTTATGnTG 1260 

TGTTACCATT ATTAAATATT TTATTAATAC AATCTTAATG CTGAGAACAA ATCAATAAAC 1320 

GTA/AGAGGA GTTGCTGAGA TAATITAATG AATCTCAGAA CTCCTTTTGA AAATTATACG 1380 

CAATATTAAC TTTGAAAATT ATACGCAATA TTAACTTTGA AAATTAGACG TTATATTTTG 1440 

TGATTTGTCA 6TATCATATT ATAATGACTT ATGTTACGTA TACAGCAATC ATTTTTAAAA 1500 

TAAAAGAAAT TTATAAACAA TCGAGGTGTA GCGAGTGAGC TATTTAGTTA CAATAATTGC 1560 

ATTTATTATT GTTTTTGGTG TACTAGTAAC TGTTCATGAA TATGGCCATA TGTTTTTTGC 1620

GAAAAGAGCA GGCATTATGT GTCCAGAATT TGCXyVTCGGT ATGGGGCCAA AAATTTTTAG 1680 

TTTTAGAAAA AATGAAACAC TTTACACTAT TAGGTTATTG CCTGTTGGTG GATATGTTCG 1740 

TATGGCAGGA GATGGCTTAG AAGAGCCACC AGTCGAGCCC GGTATGAACG TTAAAATTAA 1800 

ACTTAATGAA GAAAATGAAA TAACACATAT CATATTAGAT GATCATCATA AGTTTCAACA 1860 
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CACTGCTTAT GATAATGAAA GACATCATTT TAAAATTGCT AGAAAGTCTT TCTTTGTTGA 1980 

AAATGGTAGC TTAGTTCAAA TTGCTCCGAG AGACAGACAA TTTGCACATA AAAAGCCATG 2040 

GCCGAAATTT TTAACATTAT TTGCGGGACC GTTATTTAAC TTTATATTAG CTTTAGTCCT 2100 

ATTTATTGGT CTTGCATATT ATCaAGGcAC GCcTACGTCT ACTGTAGAAC AAGTCGCAGA 2160 

TAAGTATCCA GCTCAACAAG CAGGATTACA AAAAGGTGAT AAGATCGTCC AAATTGGCAA 2220 

ATATAAAATA TCTGAATTTG ATGATGTTGA TAAGGCGTTA GATAAAGTTA AAGATAATAA 2280 

GACGACTGTT AAATTTGAAC GTGATGGTAA AACAAAGTCA GTTGAATTAA CACCTAAAAA 2340 

GACT6AAAAA AAACTGACTA AAGTAAGTTC AGAGACGAA6 TATGTTCTCG GATTCCAACC 2400 

AGCGAGTGAA CATACACTTT TTAAACCAAT TGTATTCGGA TTTAAAAGCT TTTTAATCGG 2460 

TAGTACTTAT ATTTTTACAG CTGTAGTAGG TATGTTGGCT AGTATATTTA CGGGCGGATT 2520 

CTCATTTGAT ATGTTAAATG GTCCGGTTGG TATTTATCAT AACGTCGACT CAGTTGTTAA 2580

AGCGGGTATC ATTAGCTTAA TTGGTtnCAC TGCGTTATTA AGTGTAAACT TAGGTATTAT 2640 

GAATTTAATT CCTATTCCTG CACTAGACGG TGGTCGTATT TTATTTGTTA TATATGAAGC 2700 

GATTTTCAGA AAACCAGTTA ATAAAAAAGC GGAAACAACG ATTATTGCTA TTGGTGCCAT 2760

TTTCATGGTC GTTATAATGA TATTAGTAAC GTGGAATGAT ATTCGACGAT ATTTCTTATA 2820 

ATTTAGGAGG ATAAATAATT ATGAAGCAAT CCAAAGTTTT TATACCAACG ATGCGTGACG 2880 

TGCCATCAGA AGCAGAAGCA CAAAGTCATC GTTTATTATT GAAATCGGGT TTGATAAAAC 2940 

AAAGTACAAG TGGGATTTAT AGTTATTTAC CGCTAGCAAC ACGTGTGTTA AATAATATTA 3000 

CTGCAATTGT GCGACAAGAA ATGGAACGTA TCGATTCTGT TGAAATTTTA ATGCCAGCGT 3060 

TACAACAAGC TGAATTATGG GAAGAATCAG GACGTTGGGG TGCATATGGC CCAGAATTAA 3120 

TGCGTTTACA AGATAGaCAT GGAAgACAAT TTgCATTAGG TCCaACACAT GAAGAATTAG 3180 

TTACATCAAT AGTAAGAAAT GAATTGAAAT CATACAAACA ATTACCGATG ACATTATTCC 3240 

aAATTCAATC TAAATTCCGT GATGAAAAGA GACCACGTTT TGGTTTAyTC GTGGGCGTGA 3300 

ATTTATTATG AAAGATGCAT ATTCATTCCA TGCTGACGkG GCATCATTAG ATCAAACGTA 3360 

TCAAGATATG TATCAAGCGT ATAGCCGTAT TTTT6AGAGA GTTGGCATTA ACGCAAGACC 3420

AGTAGTTGCA GATTGAGGTG CTATAGGCGG TAGCCATaCA CATGAATTTA TGGCATTAAG 3480 

TGCTATCGGT GAGGATACAA TCGTTTACAG TAAAGAAAGT GATTATGCTG CTAACATCGA 3540 

AAAAGCAGAA GTCGTTTACG ArcCAaATcA TaAGCATACT ACTGTGCAAC CTTTAGAAAA 3600

AATTGAAACA CCAAATGTTA AGACTGCGCA AGAATTGGCA GACTTCTTAG GTAGACCAGT 3660 
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GCGTGGCCAT CATGAAATTA ATGACATTAA ATTAAAATCT TATTTCGGCA CAGATAATAT 3780 

TGAATTAGCA ACACAAGACG AAATTGTTAA TTTAGTTGGT GCAAATCCTG GTTCACTAGO 3840 

TCCTGTAATT GATAAAGAAA TCAAAATTTA TGCAGATAAT TTTGTGCAAG ATTTAAATAA 3900

TTTAGTTGTC GGTGCTAACG AAGATGGTTA TCACTTAATT AATGTAAATG TAGGTAGAGA 3960 

CTTCAACGTT GATGAATATG GCGATTTCCG TTTTATTTTA GAAGGCGAAA AGTTAAGTGA 4020 

TGGTTCAGGC GTTGCACATT TTGCTGAAGG TATTGAAGTT GGTCAAGTAT TCAAATTGGG 4080 

TACTAAGTAT TCAGAATCAA TGAATGCTAC ATTCTTAGAT AACCAAGGAA AAGCTCAATC 4140 

TTTAATTATG GGTTGTTACG GAATTGGAAT TTCTAGAACG CTAAGTGCGA TTGTTGAACA 4200 

AAATCACGAT GATAATGGAA TTGTTTGGCC TAAATCAGTT ACTCCGTTTG ATTTACATTT 4260 

AATTTCTATT AATCCTAAGA AAGATGATCA ACGAGAACTA GCAGATGCAC TATATGCTGA 4320 

ATTTAATACT AAATTTGATG TGTTGTACGA TGATCGTCAG GAACGTGCAG GTGTTAAATT 4380 

TAATGATGCC GATTTAATTG GTTTACCACT GCGAATTGTT GTTGGTAAAC GT6CATCGGA 4440 

AGGTATTGTA GAAGTTAAAG AACGTTTAAC AGGTGATAGC 6AAGAAGTTC ACATTGATGA 4500 

CTTAATGACT GTCATTACAA ATAAATATGA TAACTTAAAA TAATTAAGAT CGAATGAATT 4560

ATAAGAGTAG GAAAAAGCTG AAAGAAATCT GATGCTTATG TCCTGCTCTT ATTATTTTTG 4620 

ATATAATGAT TATTCGATGA AAAATGACTG AAGACATAGT ATAATTAAAG ATAAATTTGT 4680 

TTTAACAATA TAATGATTAG CCAAATATAA AGCATTTAAT TTTCTATCAT TACTATGCTC 4740

ACATAATCTA AATATTGTTC GAACACGTAA AAGTAATTTC TATTTAAGGT GGTAATTGTC 4800 

TTGGCAATGA CAGAGCAACA AAAATTTAAA GTGCTTGCTG ATCAAATTAA AATTTCAAAT 4860 

CAATTAGATG CTGAAATTTT AAATTCAGGT GAACTGACAC GTATAGATGT TTCTAACAAA 4920

AACAGAACAT GGGAATTTCA TATTACATTA CCACAATTCT TAGCTCATGA AGATTATTTA 4980

TTATTTATAA ATGCAATAGA GCAAGAGTTT AAAGATATCG CCAACGTTAC ATGTCGTTTT 5040

ACGGTAACAA ATGGCACGAA TCAAGATGAA CATGCAATTA AATACTTTGG GCACTGTATT 5100 

GACCAAACAG CTTTATCTCC AAAAGTTAAA GGTCAATTGA AACAGAAAAA GCTTATTATG 5160 

TCTGGAAAAG TATTAAAAGT AATGGTATCA AATGACATTG AACGTAATCA TTTTGATAAG 5220 

GCATGTAATG GAAGTCTTAT CAAAGCGTTT AGAAATTGTG GTTTTGATAT CGATAAAATC 5280 

ATATTCGAAA CAAATGATAA TGATCAAGAA CAAAACTTAG CTTCTTTAGA AgCACaTATT 5340 

CAAGAAGAAG ACGAACAAAG TGCACGATTG GCAACAGAGA AACTTGAAAA AATGAAAGCT 5400 

GAAAAAGCGA AACAACAAGA TAACAACGAA AGTGCTGTCG ATAASTGTCA AATTGGTAAG 5460 
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GCAATAGAGG GTGTCATTTT TGATATAAAC TTAAAAGAAC TTAAAAGTGG TOGCCATATC 5580

GTAGAAATTA AAGTGACTGA CTATACGGAC TCTTTAGTTT TAAAAATGTT TACTCGTAAA 5640

AAC/WUVQATG ATTTAGAACA TTTTAAAGCG CTAAGTGTTG GTAAATGGGT TAGGGCTCAA 5700

GGTCGTATTG AAQAAGATAC ATTTATTAGA GATTTAGTTA TGATGATGTC TGATATTGAA 5760

GAGATTAAAA AAGCGACAAA AAAAGATAAG GCTGAAGAAA AGCGTGTAGA ATTCCACTTG 5820

CATACTGCAA TGAGCCAAAT GGATGGTATA CCCAATATTG GTGCGTATGT TAAACAGGCA 5880

GCAGACTGGG GACATCCAGC CATTGCGGTT ACAGACCATA ATGTTGTGCA AGCATTTCCA 5940

GATGCTCACG CAGCAGCGGA AAAACATGGC ATTAAAATGA TATACGGTAT GGAAOGTATO 6000

TTAGTTGATG ATG6T0TTCC QATTGCATAC AAACCACAAG ATGTCX5TATT AAAAGATGCT 6060

ACTTATGTTG TCrTCGACGT TGAGACAACT GGTTTATCAA ATCAGTATGA TAAAATCATC 6120

GAGCTTGCAG CTGTGAAAGT TCATAACGGT GAAATCATCG ATAAGTTTGA AAGGTTTAGT 6180

AATCCGCATG AACGATTATC GGAAACGATT ATCAATTTGA CGCATATTAC TGATGATATG 6240

TTAGTAGATG CCCCTGAGAT TGAAGAAGTA CTTACAQAGT TTAAAGAATG OGTTGGCGAT 6300

GCGATATTCG TAGCGCATAA TGCTTCGTTT 6ATATGGGCT TCATCGATAC GGGATATGAA 6360

CGTCTTGGGT TTGGACCATC AACGAATGGT GTTATCGATA CTTTAGAATT ATCTCGTACG 6420

ATTAATACTG AATATGGTTU^ ACATGGTTTG AATTTCTTGG CTAAAAAATA TGGCGTAGAA 6480

TTAACGCAAC ATCACCGTGC CATTTATGAT ACAGAAGCAA CAGCTTACAT TTTCATAAAA 6540

ATGGTTCAAC AAATGAAAGA ATTAGGCGTA TTAAATCATA ACGAAATCAA CAAAAAACTC 6600

AGTAATGAAG ATGCATATAA ACGTGCAAGA CCTAGTCATG TCACATTAAT TGTACAAAAC 6660

CAACAAGGTC TTAAAAATCT ATTTAAAATT GTAAGTGCAT CATTGGTGAA GTATTTCTAC 6720

CGTAGACCTC GAATTCCACG TTCATTGTTA GATGAATATC GTGAGGGATT ATTGGTAGGT 6780

ACAGCGTGTG ATGAAGGTGA ATTATTTACG GCAGTTATGC AGAAGGACCA GAGTCAAGTT 6840

GAAAAAATTG CCAAATATTA TGATTTTATT GAAATTCAAC CACCX3GCACT TTATCAAGAT 6900

TTAATTGATA GAGAGCTTAT TAGAGATACT GAAACATTAC ATGAAATTTA TCAACGTTTA 6960

ATACATGCAG GTGACACAGC GGGTATACCT GTTATTGCGA CAGGAAATGC ACACTATTTG 7020

TTTGAACATG ATGGTATCGC ACGTAAAATT TTAATAGCAT CACAACCCGG CAATCCACTT 7080

AATCGCTCAA CTTTACCGGA AGCACATTTT AGAACTACAG ATGAAATGTT AAACGAGTTT 7140

CATTTTTTAG GTGAAGAAAA AGCGCATGAA ATTGTTGTGA AAAATACAAA CGAATTAGCA 7200

GATCGAATTG AACGTGTTGT TCCTATTAAA GATGAATTAT ACACACOGCG TATGGAAGGT 7260
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CTGCCTCAAA TCGTAATTGA TCGATTAGAA AAAGAATTAA AAAGTATTAT CGGTAATGGA 7380

TTTGCX3GTAA TTTACTTAAT TTCGCAACGT TTAGTTAAAA AATCATTAGA TGATGGATAC 7440

TTAGTTGGTT CCCGTGGTTC AGTAGGTTCT AGTTTTGTAG CGACAATGAC TGAGATTACT 7500

GAAGTAAACC CGTTACCGCC ACACTATATT TGTCCGAACT GTAAAACGAG TGAATTTTTC 7560 

AATGATGGTT CAGTAGGATC AGGATTTGAT TTACCTGATA AGACGTCTGA AACTTGTGGA 7620 

GCGCCACTTA TTAAAGAAGG ACAAGATATT CCGTTTGAAA CATTTTTAGG ATTTAAGGGA 7680 

GATAAAGTTC CTGATATCGA CTTAAACTTT AGTGGTGAAT ATCAACCGAA TGCCCATAAC 7740 

TACACAAAAG TATTATTTGG TQAGGATAAA GTATTC06T0 CAGGTACAAT TGGTACTGTT 7800 

GCTGAAAAGA CTGCTTTTGG TTATGTTAAA GGTTATTTGA ATGATCAAGG TATCCACAAA 7860 

AGAGGTGCTG AAATAGATCG ACTCGTTAAA GGATGTACAG GTGTTAAACG TACAACTGGA 7920 

CAGCATCCAG GGGGTATTAT TGTAGTACCT GATTACATGG ATATTTATGA TTTTACGCCX3 7980 

ATACAATATC CTGCCGATGA TCAAAATTCA GCATGGATGA CGACACATTT TGATTTCCAT 8040 

TCTATTCATG ATAATGTATT AAAACTTGAT ATACTTGGAC ACGATGATCC AACAATGATT 8100 

CGTATGCTTC AAGATTTATC AGGAATTGAT CCAAAAACAA TACCTGTAGA TGATAAAGAA 8160

GTTATGCAGA TATTTAGTAC ACCTGAAAGT TTGGGTGTTA CTGAAGATGA AATTTTATGT 8220 

AAAACAGGTA CATTTGGGGT ACCAGAATTC GGTACAGGAT TCGTGCGTCA AATGTTAGAA 8280 

GATACAAAGC CAACAACATT TTCTGAATTA GTTCAAATCT CAGGATTATC TCATGGTACA 8340 

GATGTGTGGT TAGGCAATGC TCAAGAATTA ATTAAAACCG GTATATGTGA TTTATCAAGT 8400 

GTAATTGGTT GTCGTGATGA TATCATGGTT TATTTAATGT ATGCTGGTTT AGAACCATCA 8460 

ATGGCTTTTA AAATAATGGA GTCAGTACGT AAAGGTAAAG GTTTAACTGA AGAAATGATT 8520 

GAAAeGATGA AAGAAAATGA AGTGCCAGAT TGGTATTTAG ATTCATGTCT TAAAATTAAG 8580 

TACATGTTCC CTAAAGCCCA TGCAGCAGCA TACGTTTTAA TGGCAGTACG TATCGCATAT 8640 

TTCAAAGTAC ATCATCCACT TTATTACTAT GCATCTTACT TTACAATTCG TGCGTCAGAC 8700 

TTTGATTTAA TCACGATGAT TAAAGATAAA ACAAGCATTC GAAATACTGT AAAAGACATG 8760 

TATTCTCGCT ATATGGATCT AGGTAAAAAA GAAAAAGACX5 TATTAACAGT CTTGGAAATT 8820 

ATGAATGAAA TGGCGCATCG AGGTTATCGA ATGCAACCX5A TTAGTTTAGA AAAGAGTCAG 8880 

GCGTTCGAAT TTATCATTGA AGGCGATACA CTTATTCC6C CGTTCATATC AGTGCCTGGG 8940 

CTTGGCGAAA ACGTTGCGAA ACGAATTGTT GAAGCTCGTG ACGATGGCCC ATTTTTATCA 9000

AAAGAAGATT TAAACAAAAA AGCTGGATTA TCTCAGAAAA TTATTGAGTA TTTAGATGAG 9060 
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GAAATAATCA AGGTATTTAT TTAATGCGTA TGGCGTAGTC AAAGAAATAC AAAATTGTTG 9180 

CTGGACACAA AATTATGCCC GTATTTCTTT TCAATGTCTT ACGAGTCTAT TCAAATGTAA 9240 

TGGTGAAATA AAGGAACAAA CTTTTACAAG AATCTCTGAT TAATAGTGAA GTCATTTGTT 9300

TCAAGCATAA ACTTATGCTA TAATTAAGTT GCTTAAAAAT TAGTGAACTC AGGCAGAAGA 9360 

GTGGGAGATT CCCGCTCTTT TCTATTTGCC AAAAAGGGAG GCCTGTATGA GTAAAATTAC 9420 

AGAACAAGTA GAAGTGATTG TTAAACCAAT TATGGAAGAC TTGAATTTTG AACTTGTAGA 9480 

CGTTGAATAT 6TCAAAGAGG GTAGAGATCA TTTTCTTAGA ATCTCTATTG ATAAAGAAGG 9540 

TGGCGTA6AT TTAAATGATT GTACGCTA6C TTCTGAAAAA ATAAGTGAAG CTATGGATCC 9600 

AAATGATCCT ATTCCTGAAA TGTATTATTT AGACGTAGCG TCACCTGGTG CAGAACGTCC 9660 

AATTAAAAAA GAACAAGATT TCCAAAATGC AATAACTAAA CCTGTATTTG TTTCTTTATA 9720 

TGTACCAATT GAAGGTGAAA AGGAATGGTT AGGCATTTTA CAAGAAGTCA ATAATGAAAC 9780

AATTGTAGTA CAAGTTAAAA TCAAAGCAAG AACGAAAGAT ATAGAGATAC CGAGAGACAA 9840 

AATAGCAAAA GCACGTCACG CAGTTATGAT TTAACGTGAT GAGGAGGAAA AAACGTGTCA 9900 

AGTAATGAAT TATTATTAGC TACTGAGTAT TTAGAAAAAG AAAAGAAGAT TCCTAGAGCA 9960

GTATTAATTG ATGCTATTGA AGCAGCTTTA ATTACTGCAT ACAAAAAGAA TTATGATAGT 10020 

GCAAGAAATG TCCGTGTGGA ATTAAATATG GATCAAGGTA CTTTCAAAGT TATCGCTCGT 10080 

AAAGATGTTG TTGZ^GAAGT ATTTGACGAC AGAGATGAAG TGQATTTAAG TACAGCGCTT 10140

GTTTU^AAACC CTGCATATGA AATTGGTGAT ATATACGAAG AAGATGTAAC ACCTAAAGAT 10200 

TTTGGTCGTG TAGGTGCTCA AGCAGCGAAA CAAGCAGTAA TGCAACGTCT TCGTGATGCT 10260 

GAACGTGAAA TTTTATTTGA AGAATTTATA GACAAAGAAG AAGACATACT TACTGGAATT 10320

ATTGACCGTG TTGACCATCG TTATGTATAT GTGAATTTAG GTCGTATCGA AGCTGTTTTA 10380 

TCTGAAGCAG AAAGAAGTCC TAACX5AAAAA TATATTCCTA ACGAACGTAT CAAAGTATAT 10440 

GTTAACAAAG TGGAACAAAC GACAAAAGGT CCTCAAATCT ATGTTTCTCG TAGCCATCCA 10500 

GGTTTATTAA AACGTTTATT TGAACAAGAA GTTCCAGAAA TTTAOGATGG TACTGTAATT 10560 

GTTAAATCAG TAGCACGTGA AGCTGGCGAT CGCTCTAAAA TTAGTGTCTT CTCTGAAAAC 10620 

AATGATATAG ATGCTGTTGG TGCATGTGTT GGTGCTAAAG GCGCACGTGT TGAAGCTGTT 10680 

GTTGAAGAGC TAGGTGGTGA AAAAATCGAC ATCGTTCAAT GGAATGAAGA TCCAAAAGTA 10740 

TTTGTAAAAA ATGCTTTAAG CCCTTCTCAA GTTTTAGAAG TTATTGTTGA TGAAACAAAT 10800

CAATCTACAG TAGTTGTTGT TCCTGATTAT CAATTGTCAT TAGCGATTGG TAAAAGAGGA 10860 
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GATGCGCGTG AAGCGGGTAT CTATCCAGTA GTTGAAGCTG AAAAAGTAAC TGAAGAAGAT 10980

GTTGCTTTAG AAGATGCTGA CACAACAGAA TCAACCX3AAG AGGTAAATGA TGTTTCAGTT 11040

GAAACAAATG TAGAGAAAGA ATCTGAATAA TAGGTTGGAG TGAAGTATCT ATGAAAAAGA 11100

AAAAAATTCC GATGCGAAAA TGTATTCTTT CAAATGAAAT GCATCCCAAA AAAGATATGA 11160

TTCGTGTTGT TGTTAATAAA GAAGGCGAAA TCTTTGCGGA TGTTACTGGA AAGAAACAAG 11220

GCCGTGGCXSC ATATGTTTCT AAAGATGTTG CTATGGTTGA AAAAGCACAA CAAAAAGAAA 11280

TTTTAGAAAA ATATTTTAAA GCATCTAAAG AGCAATTGGA TCCTGTTTAC AAAGAAATTA 11340

TTAGATTAAT TTATAGAGAA GAGATCCCAA AATGAGTATA GATCAAATAT TAAACTTTTT 11400

AGGATTAGCA ATGAGAGCTG GTAAAGTAAA AACAGGTGAA TCAGTCATTG TTAATGAGAT 11460

TAAAAAAGGA AATTTGAAGC TCGTTATTGT TGCAAATGAT GCGTCTGATA ATACAGCTAA 11520

ATTAATTACA GATAAATGTA AGAGTTACAA AGTTCCATTC AGAAAGTTTG GAAATCGAAA 11580

TGAATTGGGA ATAGCACTTG GAAAAOGTGA GCGTGTTAAT GTAGGGATTA CTGACCCAGG 11640

CTTTGCTAAA AAGTTGCTAT CAATGATAGA TGAATATCAT AAGGAGTGAT TATATGAGTA 11700

AACAAAGAAT TTACGAATAT GCGAAAGAAT TAAATCTAAA GAGTAAAGAG ATTATAGATG 11760

AGTTAAAAAG CATGAATATT GAGGTTTCAA ATCATATGCA AGCTTTGGAA GATGACCAAA 11820

TTAAAGCATT AGATAAAAAG TTCAAAAAAG AACAAAAGAA CGACAATAAA CAAAGCACTC 11880

AAAATAATCA CCAAAAATCA AACAATCAAA ACCAAAATAA AGGGCmACAA AAAGATAACA 11940

AAAAGAATCm ACAACAAAAT AATAAAGGCA ACAAAGGCAA TAAAAAGAAT AATAGAAATa 12000

ATAAGAAAAA TAACAAGAAT AATAAACCAC AAAATCAACC AGCTGCTCCA AAAGAAATAC 12060

CATCAAAAGT GACATATCAA GAAGGTATTA CAGTAGGCGA ATTTGCGGAT AAATTAAATG 12120

TTGAATCATC AGAAATTATC AAAAAATTAT TCTTACTTGG TATTGTTGCT AATATCAATC 12180

AATCATTAAA TCAAGAAACA ATCGAATTAA TTGCCGATGA TTATGGCGTT GAGGTTGAAG 12240

AAGAAGTTGT GATTAATGAA GAAGACTTAT CAATCTATTT CGAAGACGAA AAAGATGATC 12300

CAGAGGCAAT TGAGAGACCA GCAGTTGTAA CAATTATGGG ACATGTTGAC CATGGTAAAA 12360

CGACTTTATT AGATTCAATT CGTCATACAA AAGTTACAGC AGGTGAAGCA GGCGGAATCA 12420

CTCAACATAT TGGTGCATAT CAAATTGAAA ACGATGGCAA AAAAATCACT TTCTTAGATA 12480

CACCGGGACA TGCTGCATTT ACAACGATGC GTGCGCGTGG TGCaCAAGTA ACAGATATTA 12540

CTATTTTAGT AGTAGCAGCT GACGATGGTG TTATGCCACA AACAATTGAA GCAATTAACC 12600

ATGCTAAAGA AGCAgAAGTA CCAATTATTG TTGCAGTAAA TAAAATTGAT AAACCAACTT 12660
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GCGG?CGAAAC AATTtTCGTc CACTTTCTGC ATTAAGTGGT GATGGTATCG ACGATTTATT 12780 

AGAAATGATA GGATTAGTTG CAGAAGTTCA AGAACTTAAA GCAAATCCTA AAAACCGTGC 12840 

TGTTGGTACA GTTATCGAAG CTGAATTAGA TAAATCACGT GGTCCTTCTG CATCATTATT 12900

AGTACAAAAC GGTACATTAA ATGTTGGTGA TGCGATTGTA GTTGGTAATA CTTACGGCCG 12960 

TATCCGTGCA ATGGTTAATG ACTTAGGTCA AAGAATCAAA ACGGCTGGTC CATCAACX3CC 13020 

TGTTGAAATT ACAGGTATTA ATGATGTGCC ACAAGCTGGG GATCGCTTTG TTGTATTTAG 13080 

TGATGAAAAA CAAGCTCGTC GTATTGGTX3A ATCAA6ACAC GAAGCTAGCA TTATACAACA 13140 

AC6TCAAGAA AGTAAAAATG TTTCATTAGA TAACCTGTTT GAACAAATGA AACAAGGTGA 13200 

AATGAAAGAT TTAAACGTTA TTATTAAAGG TGATGTTCAA GGTTCTGTTG AAGCTTTAGC 13260 

TGCATCATTA ATGAAAATTG ATGTTGAAGG CGTAAATGTT CGTATCATTC ATACAGCGGT 13320 

TGGTGGAATT AATGAGTCAG ACGTGACACT T6CTAATGCC TCAAATGGTA TTATCATTGG 13380

TTTCAATGTT CGTCCAGACA GTGGTGCAAA ACGTGCTGCA GAAGCTGAAA ATGTTGATAT 13440 

GCGTTTACAC AGAGTTATTT ATAATGTTAT CGAAGAAATT GAATCAGCGA TGAAAGGTTT 13500 

ACTTGATCCA GAATTTGAAG AACAAGTTAT CGGACAAGCT GAAGTTCGTC AAACATTCAA 13560

AGTTTCTAAA GTTGGTACTA TTGCTGGATG TTATGTTACT GAAGGTAAAA TTACGCGAAA 13620 

TGCTGGTGTA CGTATTATTC GTGATGGTAT TGTTCAATAT GAAGGCGAAT TAGATACACT 13680 

TAAACGTTTC AAAGATGATG CTAAGGAAGT TGCAAAAGGT TATGAATGTG GTATTACAAy 13740

TGAAAACTAC AATGACCTTA AAGAAGGCGA TGTTATCGAA GCATTTGAAA TGGTTGAAAT 13800 

TAAGCGTTAA TTAAATAAAT TACAAGCTAA AAGTATAGTT AAGATTGATA TGCTCCCTAT 13860

AAATATTGCA CTTTTTAAGT GTCTACTTTA TAGGGAGCAT ATTTGATACT AGCTTTTGGT 13920

TTTPTATTAG AATAGATTAC CTATTAAAAG TTACGTTATA TGGACATGAT TTTGTATAAA 13980 

ATTTTGTGGT GGCCTAGAAT GATTTTTAAT GACAAAATAT AATGTCGACT ATTATTGGAA 14040 

AATTTTCTGT TGaAATGCCT ATCTTACGGC AAACTTTATT TGATTTTATA GGCTTAATTT 14100 

ATTAAAATAA CGTGTGAGCT AAAATAATTG TTTAAGCATT GTTACACTAA AAAATGCAAA 14160 

TAACAATTGA ACTTAAAGAT AAAGAGGTGA CAAGAATGAG CAGTATGAGA GCAGAGCX3TG 14220 

TTGGTGAACA AATGAAGAAG GAATTAATGG ATATCATCAA CAATAAAGTC AAAGATCCTC 14280 

GAGTTGGTTT TATTACAATT ACAGATGTTG TTTTAACAAA TGATTTATCG CAGGCTAAAG 14340 

TATTTTTAAC TGTATTAGGT AACX3ATAAAG AAGTAGAAAA TACATTTAAA GCACTTGATA 14400

AAGCAAAAGG CTTCATTAAG TCTGAATTAG GTTCTAGAAT GCGATTACGT ATTATGCCGG 14460 
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AAGATTTACA CAAACAAGAT AGATAATTTA GTGTTAGGTA TCTGGAAAAT GTTTGATAAT 14580

TTCTTAATAT CGGTATATTA ACATTAAACA GTTAATACAT AGATGTGTAG AAATAGTTAA 14640

CATTTTCCAG TTTTTTTATG AATAAATTTA GTTGATACGC TATTAAAATA TATTTTAAAA 14700

AAGAAGGTGA CTATATGTAT AATGGGATAT TACCAGTATA TAAAGAGCXSC GGTTTAACAA 14760

GTCATGACGT TGTATTCAAA TTGCGTAAAA TATTAAAAAC TAAAAAAATA GGTCACACGG 14820

GTACGCTTGA TCCCGAAGTT GCAGGCGTGT TACCGGTATG TATAGGTAAT GCAACGAGAG 14880

TTAGTGATTA TGTTATGGAT ATGGGCAAAG CTTATGAAGC AACTGTATCG ATAGGAAGAA 14940

GTACAACGAC TGAAGATCAA ACGGGTGATA CATTGGAAAC AAAAGGT6TA CACTCAGCAG 15000

ATTTTAATAA GGACGATATT GACCGATT6T TAGAAAGTTT TAAA6GTATC ATTGAACAAA 15060

TTCCGCCXSAT GTACTCATCC GTCAAAGTAA ATGGTAAAAA ATTATATGAA TATGCGCGTA 15120

ATAATGAAAC AGTTGAAAGA CCAAAGCGTA AAGTtAATAT TAAAGACATT GGGCGTATAT 15180

CTGAATTAGA TTTTAAAGAA AATGAOTGTC ATTTTAAAAT AOGCGTCATC TGTGGTAAAG 15240

GTACATATAT TAGAACGCTA GCAACT6ATA TTGGTGTGAA ATTAGGCTTT CC6GCACATA 15300

TGTCGAAATT AACACGAATC GAGTCTGGTG GATTTGTGTT GAAAGATAGC CTTACATTAG 15360

AACAAATAAA AGAACTTCAT GAGCAGGATT CATTGCAAAA TAAATTGTTT CCTTTAGAAT 15420

ATGGATTAAA GGGTTTGCCA AGCATTAAAA TTAAAGATTC GCACATAAAA AAACGTATTT 15480

TAAATGGGCA GAAATTTAAT AAAAATGAAT TTGATAACAA AATTAAAGAC CAAATTGTAT 15540

TTATTGATGA TGATTCAGAA AAAGTATTAG CAATTTATAT GGTACACCCT ACAAAAGAAT 15600

CAGAAATTAA ACCTAAAAAA GTCTTTAATT AAAGGAGATA GAATTTATGA AAGTCATAGA 15660

AGtGACACAT CCTATACAAT CTAAACAGTA TATTACAGAG GATGTTGCAA TGGCATTCGG 15720

ATTtrrrCGAT GGCATGCATA AAGGTCATGA CAAAGTCTTT GATATATTAA ACGAAATAGC 15780

TGAGGCACGC AGTTTAAAAA AAGCGGTGAT GACATTTGAT CCGCATCCGT CTGTCGTGTT 15840

GAATCCTAAA AGAAAACGAA CAACGTATTT AACGCCACTT TCAGATAAAA TCGAAAAAAT 15900

TAGCCAACAT GATATTGATT ATTGTATAGT GGTTAATTTT TCATCTAGGT TTGCTAATGT 15960

GAGCGTAGAA GATTTTGTTG AAAATTATAT AATTAAAAAT AATGTAAAAG AAGTCATTGC 16020

TGGTTTTGAT TTTACTTTTG GTAAATTTGG AAAAGGTAAT ATGACTGTAC TTCAAGAATA 16080

TGATGCGTTT AATACGACAA TTGTGAGTAA ACAAGAAATT GAAAATGAAA AAATTTCTAC 16140

AACTTCTATT CGTCAAGATT TAATCAATGG TGAGTTGCAA AAAGCGAATG ATGCTTTAGG 16200

CTATATATAT TCTATTAAAG kCACTGTAGT GCAAGGTGAA AAAAGGGGAA GAACTATTGG 16260
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TGCTGTTAGT ATTGAAATCG GCACTGAAAA TAAATTATAT CGAGGGGTAG CTAACATAGG 16380 

TGTAAAGCCA ACATTTCATG ATCCTAACAA AGCAGAAGTT GTCATCGAAG TGAATATCTT 16440 

TGACTTTGAG GATAATATTT ATGGTQAACG AGTGACCGTG AATTGGCATC ATTTCTTACG 16500 

TCCTGAGATT AAATTTGATG GTATCGACCC ATTAGTTAAA CAAATGAACG ATGATAAATC 16560 

GCGTGCTAAA TATTTATTAG CAGTTGATTT TGGTGATGAA GTAGCTTATA ATATCTAGAG 16620 

TTGCGTATAG tTATATAAAC AATCTATACC ACACCTTTTT CTTAGTAGGT CGAATCTCCA 16680 

ACGCCTAACT CGGATTAAGG AGTATTCAAA CATTTTAAGG AGGAAATTGA TTATGGCAAT 16740 

TTCACAAGAA CGTAAAAACG AAATCATTAA AGAATACCGT GTACACX3AAA CTGATACTGG 16800 

TTCACCAGAA GTACAAATCG CTGTACTTAC TGCAGAAATC AACGCaGTAA ACGAACACTT 16860 

AOGTACACAC AAAAAAGACC ACCATTCACX3 TCGTGGATTA TTAAAAATGG TAGGTCGTCG 16920 

TAGcATTTaT TAAACTACTT ACGTaGTAAA GATATTCAAC GTTACCX3TGA ATTAATTAAA 16980

TCACTTGGTA TCCGTCGTTA ATCTTAATAT AACGTCTTTG AGGTTGGGGC ATATTTATGT 17040 

TCCAACCTTA ATTTATATTA AAAAAGCTTT TTACAAATAT TAACATTTAT TATATGTTAA 17100 

GCTAATATTG AGTGAATAAT AAGGTTACAA TGAGATAAAG ATGATATAAG TACACCTAGA 17160

GTAATAATCA AGATATTAAA AATAAAGTAT GTTTTTTTAA AAAATATAAC TTATATTTAT 17220 

ACTGATAAGG GTGGGACGAT AAGTCTATTT TGTAAATAAT AGATGGATAT CCCGCTCTCT 17280 

TTTTTTCCAA TTCAATATTT TATAACTAAT ATTAAAATAC GATAATAAAT GATATGATAT 17340 

AACTATTAGA TTCAAGAGAG GAGATTTATA ATGTCTCAAG AAAAGAAAGT TTTTAAAACT 17400 

GAATGGGCAG GAAGATCTTT AACGATTGAA ACAGGGCAAT TAGCTAAACA AGCAAATGGC 17460 

GCTGTATTGG TTCGTTATGG AGATACAGTC GTGTTATCGA CGGCAACTGC ATCAAAAGAA 17520 

CCTCGTGATG GAGATTTCTT CCCATTAACA GTGAACTATG AAGAAAAAAT GTACGCTGCG 17580 

GGTAAAATTC CTGGTGGATT TAAAAAGAGA GAAGGACX3TC CTGGTGACGA TGCAACATTA 17640 

ACTGC6CGAT TAATTGATAG ACCAATTAGA CCTTTATTCC CTAAAGGATA TAAGCATGAT 17700 

GTTCAAATTA TGAACATGGT ATTAAGTGCA GATCCTGATT GTTCACCACA AATG6CTGCA 17760 

ATGATTGGTT CATCTATGGC GCTTAGTGTG TCGGATATTC CATTCCAAGG GCCAATCGCC 17820

GGTGTAAATG TGGGTTATAT TGACGGTAAA TATATCATTA ACCCAACAGT AGAAGAAAAA 17880 

GAAGTTTCTC GTTTAGACCT TGAAGTAGCT GGTCATAAAG ATGCGGTAAA CATGGTAGAG 17940 

GCAGGCGCTA GTGAGATTAC TGAACAAGAA ATGTTAGAGG CGATTTTCTT TGGTGATGAA 18000

GAGATTCAAC GTTTAGTTGA TTTCCAACAA CAAATOGTCG ACCACATTCA ACCTGTTAAA 18060 
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GAAGAAAAAG GACTTAAAGA AACAGTTTTA ACATTTGATA AACAACAACG AGATGaAAAT 18180

CTTGATAACT TAAAAGAAGA AATCGTCAAT GAATTTATCG ATGAAGAAGA TCCAGAGAAT 18240

GAaTTACTTA TTAAAGAAGT TTATGCAATT TTAAATGAAT TAGTGAAAGA AGAAGTTCGA 18300

CGTTTAATTG CAGATGAAAA AATTAGACCA GACGGCCGTA AACCTGATGA AATCCGTCCA 18360

TTAGATTCTG AAGTTGGTAT TTTACCTAGA ACGCATGGTT CAGGTCTATT TACACGTGGT 18420

CAGACTCAAG CACITTCAGT TTTAACATTA GGTGCTTTAG GCGATTATCA ATTAATTGAT 18480

GGTTTAGGAC CTGAA6AAGA AAAAAGATTC ATGCATCATT ACAACTTCCC GAATTTTTCA 18540

GTAGGTGAAA CTGGTCCAGT AC6TGCGCCA GGTCGTCGTG AAATTGGACA TGGTGCGTTA 18600

GGTGAAAGA6 CATTAAAATA TATTATTCCT GATACTGCTG ATTTCCCATA TACAATTCGT 18660

ATTGTAAGTG AGGTACTTGA ATCAAATGGT TCATCATCTC AAGCGTCAAT TTGTGGATCA 18720

ACATTAGCAT TAATGGATGC GGGCGTACCG ATTAAAGCAC CAGTTGCTGG TATTGCTATG 18780

GGCCTTGTTA CACGTGAAGA TAGCTATACG ATTTTAACTG ATATCCAAGG TATGGAAGAT 18840

GCATTAGGTG ATATGGACTT TAAAGTCGCT GGTACTAAAG AAGGTATTAC AGCAATCCAA 18900

ATGGATATTA AAATTGACGG TTTAACGCGT GAAATTATCG AAGAGGCTCT AGAACAAGCXS 18960

AGACGTGGTC GTTTAGAAAT AATGAATCAT ATGTTACAAA CAATTGATCA ACCACGTACT 19020

GAATTAAGTG CTTACGCGCC AAAAGTTGTA ACTATGACAA TTAAACCAGA TAAGATTAGA 19080

GATGTTATCG GACCTGGTGG TAAAAAAATT AACGAAATTA TTGATGAAAC AGGTGTTAAA 19140

TTAGATATTG AACAAGATGG TACTATCTTT ATTGGTGCTG TTGATCAAGC TATGATAAAT 19200

CGTGCTCGTG AAATCATTGA GGAAATTACA CGTGAAGCGG AAGTAGGTCA AACTTATCAA 19260

GCCACTGTTA AACGTATTGA AAAATACGGT GCGTTTGTAG GCCTATTCCC AGGTAAAGAT 19320

GCGTTGCTTC ACATTTCACA AATTTCAAAA AATAGAATTG AAAAAGTGGA AGATGTATTA 19380

AAAATCGGTG ACACAATTGA AGTTAAGATT ACTGAAATTG ATAAACAAGG TCX3AGTAAAT 19440

GCTTCACATA GAGCATTAGA AGAATAATAT TTAAAGTCAT ATGACGACAA TGTATCGTCA 19500

TGTGATTTTT TTATGCCACT TTTTACGAAG TGACCCGTTT TGAATTTGTT GTATTGAACA 19560

TTTTAAAACG CTTTATTATT TTGTGTGCAA CTGTTAATTA TCCTGTATGT ATAGTGATTA 19620

ATAGTGTACA TCAAGTGTTT TTTAACTTAT AATGAATAGT GAGTTTATAT ATGGACGGGT 19680

AACAAATTTA GGAGGTAAGA TTTTGAGTTT AATAAAGAAA AAGAATAAAG ATATTCGCAT 19740

TATACCATTA GGCGGTGTT6 GCGAAATTGC TAAAAATATG TATATCGTTG AAGTAGACGA 19800

TGAAATGTTT ATGTTAQATG CTGGACTTAT GTTTCCAGAA GACGAAATGC TAGGTATTGA 19860
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CCTTACACAC GGACATGAGC ACGCGATTGG TGCAGTGAGT TATGTTTTAG AACAATTAGA 19980

TGCACCAGTA TATGGATCTA AATTGACAAT AGCGTTAATT AAAGAAAATA TGAAAGCCCG 20040

TAATATTGAT AAAAAAGTTC GCTACTATAC AGTTAATAAT GATTCAATTA TGAGATTCAA 20100

AAACGTGAAT ATTAGTTTCT TTAATACGAC ACACAGTATT CCTGATAGTT TAGGTGTTTG 20160

TATTCACACT TCATATGGTG CCATTGTGTA TACAGGTGAA TTTAAGTTTG ACCAAAGTTT 20220

ACATGGACAT TATGCACCAG ATATTAAACG TATGGCAGAG ATTGGTGAAG AAGGCGTATT 20280

TGTCTTAATC AGTGATTCTA CTGAGGCAGA GAAACCTGGA TATAATACTC CGGAAAAXGT 20340

GATTGAACAT CATATGTATG ATGCTTTTGC AAAAGTGCGA GGTCGCTTGA TAGTTTCATG 20400

TTATGCTTCG AACTTTATAC GTATTGAGCA AGTTTTAAAT ATTGCTAGCA AGCTAAATCG 20460

TAAAGTGTCA TTTTTAGGAA GATCACTTGA AAOTTCATTT AATATTGCTC GTAAAATGGG 20520

GTATTTCGAC ATTCCTAAAG ATTTGCTAAT TCCTATAACA GAAGTTGATA ATTATCCTAA 20580

AAATGAA6TG ATAATTATAG CTACTGGTAT GCAAGGAGAA CCTGTAGAAG CCTTAAGTCA 20640

AATGGCGCAA CATAAGCATA AAATTATGAA TATCGAAGAA GGCGATTCTG TATTTTTAGC 20700

AATTACGGCT TCTGCTAATA TGGAAGTTAT CATTGCGAAT AcATTAAATG AGCtTgTtAC 20760

GnCTGGCGCA CATATTATTC CAAATAACAA AAAGATTCAT GCTTCAAGTC ATGGTTGCAT 20820

GGAAGAATTA AAAATGATGA TTAATATTAT GAAACCTGAA TACTTTATTC CTGTACAAGG 20880

TGAATTTAAA ATGCAGATAG CACATGCGAA GCTAGCAGCT GAAGCAGGTG TTGCACCAGA 20940

AAAGATTTTC CTTGTGGAAA AAGGAGATGT CATTAATTAC AACGGTAAAG ATATGATATT 21000

AAATGAAAAG GTAAATTCAG GAAATATTTT AATAGATGGC ATTGGTATTG GGGATGTAGG 21060

AAATATCGTG TTGAGAGACC GTCATCTTTT AGCAGAAGAT GGTATCTTTA TTGCTGTTGT 21120

AACdiTAGAT CCTAAAAATA GACGTATAGC TGCGGGACCT GAAATTCAAT CTCGTGGGTT 21180

TGTATATGTA CGTGAAAGTG AAGACTTATT ACGTGAAGCA GAAGAGAAAG TACGTGAAAT 21240

AGTAGAGGCT GGTTTACAAG AAAAACGCAT AGAATGGTCT GAAATTAAAC AAAATATGCG 21300

TGATCAAATT AGTAAACTAT TATTCGAAAG TACAAAACGT CGTCCTATGA TTATTCCAGT 21360

AATTTCTGAA ATTTAATCAA AAAGTCATTA ACATAAAAGA GGTCAGAACA AGTCACTGAA 21420

ATATAATGGT TGTCATGGAC AATTTACTTA TATTTTATGA TAGTCAATTG AAGGGGTAAC 21480

GATTAATCTG TTATCTTAAG TAAATTGATA CATAGATGAT ATTGTTCTAA CCTCTTTCAT 21540

CGTCTGTTTG GACTACATAT TCTAAACATC AAATAGGAAA TTATATATAA TAACGTCGTT 21600

TTAACTAAGG CAACATAAGG AGGTGCGTCA ATTGGCACAA GCAAAAAAGA AATCXyVCAGC 21660
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GATACGTTAT GTCATAGCTA TTTTAGTAGT TGTATTAATG GTGTTGGGTG TTTTCCAATT 21780 

AGGAATAATA GGTCGTCTAA TTGACAGCTT CTTTAATTAT TTATTTOGGT ACAGTAGATA 21840 

TTTAACATAT ATTTTAGTAC TCTTAGCAAC TGGTTTTATT ACATACTCTA AACGTATTCC 21900 

TAmaACTAGA CGAACGGCTG GTTCGATTGT ATTGCAAATT GCATTGCTAT TTGTATCACA 21960 

GTTAGnTTT CATTTTAATA GTGGTATCAA AGCTGAAAGA GAACCTGTAC TTTCTTATGT 22020 

GTATCAGTCA TACCAACACA GTCATTTCCC AAATTTTGGT GGCGGTGTAT TAGGCTTTTA 22080 

TTTATTAGAG TTAAGCGTAC CTTTAATTTC ATTATTTGGT GTATGTATTA TTACTATTTT 22140 

ATTATTATGC TCAAGTGTTA TTTTATTAAC AAACCATCAA CATCGTGAAG TTGCAAAAGT 22200 

TGCACTGGAA AATATAAAAG CTTGGTTTGG TTCATTTAAT GAA 22243 
(2) INFORMATION FOR SEQ ID NO: 165: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5510 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165: 

TTATTAATnA TTAATATTTT TATTTTTAAA AATAAAGCGA GGAGCTATCA ATGGAACAAA 60 

TTACTTCTGC ACAAAATAAT AGAATTAAAC AAGCGAACAA GCTAAAAmAG AAACGTGAGA 120 

GGGATAAAAC TGGATTAGCT TTAATTGAAG GTGTGCATTT AATTGAAGAA GCTTATCAAA 180 

GTGGAATTGT AATTACACAA TTATTTGCAA TTGAACCGGC AAGATTAGAT CAGCAAATTA 240 

WCGCATACGC GCAAGAAGTT TTTgAAATAA ACATGAAAGT TGCTGAATCT TTATCAGGTA 300 

CAGTSACACC ACAAGGGTTT TTCGCAATCA TTGAGAAGCC GCATTATGAT ATTTCTAAAG 360 

CACAACAAGT ATTGCTCATC GATCGTGTTC AAGATCCTGG AAATTTAGGC ACATTAATTA 420 

GAACTGCGGA TGCTGCTGGA ATGGATGCTG TAATAATGGA GAAGGGTACG ACAGATCCTT 480 

ATCAAGATAA AGTGTTGCGA GCGAGTCAAG GTAGTGTTTT CCATTTGCCA GTTATGACAC 540 

AAGATCTCGA TACGTTTATT ACTCAATTTA ATGGTCCTGT TTATGGTACA GCACTTGAAA 600 

ACGCAGTGgC ATACAAAGAA GTTACTTCAA GTGATTCTTT TGCATTACTA TTAGGTAATG 660 

AGGGAGAAGG TGTTAATCCT GAATTATTAG CACATACTAC ACAAAATTTA ATCATACCTA 720 

TTTATGGTAA AGCTGAAAGT TTAAATGTAG CGATTGCAGG TAGTATTTTA CTTTATCATT 780 

TGAAAGGTTG ACCGTGTTGA AAGTTTTCCG ATATAATTAT AATTAATTGT TTAACAGAAC 840 
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ATAAATAATT GTTTTAGGGA GAATAATCGT GACTGCAAGT TATTCCAATT ATTTAAAGTC 960 

TTTTCACCTT TTTGGTTACT TAAAGAGATT TAAGTCGGAA AGACAATCCX5 TTATCAATAT 1020 

TAAACAAGTG TATGCTTAGG CATAAATTTG GGTGGTACCA CGGAAATGAC TTTCGTCCCT 1080 

TATTTTTTAA GAGGATGAAA GTCTTTTTTT AGTTAAACAA CAAATATGAT AAATAGAAAA 1140 

TGAATAGTTC GAATAGGGAG GTCAGTGACA TATGTCTGAA CAACAAACAA TGTCAGAGTT 1200 

AAAACAACAA GOGCTTGTAG ATATTAATGA AGCAAATGAT GAACGTGCAC TGCAAGAAGT 1260 

TAAAGTGAAA TACTTAGGTA AAAAAGGGTC AGTTAGCGGA CTAATGAAAT TGATGAAGGA 1320 

TTTGCCGAAT GAAGATAAAC CTGCGTTTGG TCAAAAAGTG AATGAATT6C GTCAAACAAT 1380 

TCAAAATGAA TTAGATGAAA GACAACAGAT GTTAGTTAAA GAAAAATTAA ATAAGCcAAT 1440 

TGGcTGAAGA AACAATTGAT GTATCATTAC CAGGTCGTCA TATTGAAATC GGTTCAAAGC 1500 

ATCCATTAAC ACGTACAATA GAAGAAATTG AAGACTTATT CTTAGGTTTA GGTTATGAAA 1560

TTGTGAATGG ATATGAAGTT GAACAAGATC ATTATAACTT CGAAATGCTG AATTTACCTA 1620 

AATCACACCC TGCACGTGAT ATGCAAGATA GTTTCTATAT TACGGATGAA ATTTTATTAC 1680 

GTACGCATAC ATCACCAGTG CAGGCACGTa CGATGGAATC ACGTCATGGT CAAGGTCCAG 1740 

TTAAAATTAT TTGCCCTGGT AAAGTGTATC GTCGTGACTC TGATGATGCG ACACATAGTC 1800 

ATCAATTTAC ACAAATCGAA GGATTAGTTG TTGATAAAAA CGTTAAAATG AGTGATTTGA 1860 

AAGGTACTTT AGAATTGTTA GCTAAGAAAT TATTTGGTGC TGATCGTGAA ATTCGTTTAC 1920 

GTCCAAGTTA CTTCCCATTC ACTGAACCTT CTGTAGAAGT TGATGTGTCA TGTTTTAAAT 1980 

GTAAAGGAAA AGGTTGTAAT GTGTGTAAAC ACACAGGATG GATTGAAATT TTAGGTGCTG 2040 

GAATGGTACA TCCTAATGTA TTAGAAATGG CTGGTTTTGA TTCTTCAGAG TACTCTGGAT 2100 

TTGp^TTTGG TATGGGACCA GACCGTATTG CAATGTTGAA ATATGGTATA GAAGATATTC 2160 

GTCATTTCTA TACTAATGAT GTGAGATTTT TAGATCAATT TAAAGCGGTA GAAGATAGAG 2220

GTGACATGTA ATGTTGATAT CAAATGAATG GTTGAAAGAA TATGTAACAA TCGATGATTC 2280 

TGTAAGTAAT TTGGCAGAAC GTATTACGCG CACAGGTATT GAAGTGGATG ATTTAATTGA 2340 

CTACACAAAA GATATCAAAA ATTTAGTTGT CGGCTTCGTT AAGTCAAAAG AGAAACATCC 2400 

TGATGCTGAT AAATTAAATG TTTGCCAAGT TGATATCGGA GAAGACGAAC CTGTACAAAT 2460 

CGTTTGTGGT GCACCGAACG TTGaTGCAGG ACAATATGTC ATTGTTGCTA AAGTAGGTGG 2520 

CAGATTGCCT GGTGGTATTA AAATTAAGCG TGCCAAATTA CGCGGTGAAC GTTCAGAAGG 2580 

TATGATTTGT TCGTTACAAG AAATTGGTAT TTCAAGTAAC TATATACCXyV AAAGTTTTGA 2640 
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ATATTTAGAT GATCAAGTAA TGGAATTTGA TTTAACGCCG AATCGTGCAG ATGCTTTAAG 2760

TATGATAGGT ACTGCTTATG AAGTTGCAGC ATTATATAAT ACAAAAATGA CTAAGCCAGA 2820

GACAACATCA AATGAGCTTG ATTTATCTGC AAATGATGAA CTGACTGTGA CAATTGAAAA 2880

TGAAGATAAA GTACCATATT ATAGTGCACG TGTTGTTCAC GACGTGACAA TTGAACCCTC 2940

GCCAATTTGG ATGCAAGCAC GCTTAATAAA AGCGGGTATA CGTCCTATTA ATAATGTTGT 3000

TGACATTTCA AATTATGTGT TATTAGAATA CGGTCAACCA TTGCACATGT TTGATCAAGA 3060

TGCGATTGGT TCACAACAAA TTGTTGTTCG TCAAGCTAAT GAAGGCGAAA AAATGACAAC 3120

ATTAGATGAT ACAGAACX5TG AATTATTAAC GAGCGATATT GTCATTACTA ATGGACAAAC 3180

TCCAATTGCA TTAGCTGGTG TTATGGGTGG CGATTTTTCA GAAGTTAAAG AACAAACATC 3240

AAATATAGTG ATTGAAGGTG CTATTTTTGA TCCAGTTTCA ATTCGTCATA CATCAAGACG 3300

TTTAAATTTA CGCAGTGAAT CATCTAGTCG TTTTGAAAAA GGAATAGCTA CTGAATTTGT 3360

AGATGAAGCA GTCGACCGTG CATGTTATTT ATTACAAACT TATGCAAACG GAAAAGTGCT 3420

AAAAGATAGA GTGTCTTCAG GAGAACTTGG TGCATTTATT ACACCAATCG ACATCACTGC 3480

TGATAAAATT AATCGCACTA TTGGATTTGA TTTGTCACAA AATGATATTG TTACTATTTT 3540

TAATCAACTA GGGTTTGATA CAGAAATAAA TGATGATGTT ATTACAGTGC TAGTACCATC 3600

ACGTCGTAAA GATATTACAA TTAAAGAAGA TTTAATTGAA GAAGTTGCAC GTATATATGG 3660

ATACGACGAT ATTCCATCAA CGTTACCTGT CTTCGATAAA GTTACTAGTG GTCAGCTAAC 3720

TGATCGCCAA TATAAAACTA GAATGGTTAA AGAAGTGTTA GAAGGTGCTG GATTAGACCa 3780

AGCTATTACG TATTCGTTAG TTTCTAAAGA AGATGCTACT GCaTTTTCGA TGCAACAGCG 3840

TCAAACAATT GATTTATTGA TGCCAATGA6 TGAAGCGCAT GCX3TCATTAC GTCAAAGTTT 3900

ATTiSCCACAT TTAATCGAAG CGGCATCATA TAATGTGGCA CGCAAAAATA AAGATGTAAA 3960

ATTATTTGAA ATCGGCAATG TCTTCTTTGC TAATGGAGAA GGTGAACTAC CAGATCAAGT 4020

TGAATATTTA AGTGGTATTT TAACTGGAGA TTATGTAGTC AATCAATGGC AAGGTAAGAA 4080

AGAAACGGTT GATTTCTATT TAGCAAAAGG TGTCGTGGAT CGAGTATCTG AAAAGTTAAA 4140

TCTTGAATTT AGTTATCX5CC GTGCTGATAT TGaTGGATTA CATCCAGGTC GTACTGCTGA 4200

AATCTTATTA GAGAATAAAG TTGTTGGTTT TATTGGTGAA TTACATCCAA TATTAGCAGC 4260

TGATAATGAT TTAAAACGTA CGTATGTTTT TGAGTTGAAT TTTGATGCAT TAATGGCTGT 4320

GTCGGTAGGT TACATTAATT ACCAGCCAAT TCCGAGATTC CCAGGCATGT CTCGT6ACAT 4380

TGCATTAGAA GTAGATCAAA ATATTCCAGC AGCTGATTTA TTATX:AACGA TTCATGCACA 4440
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AAAAGGTAAA AAATCAATTG CAATACGTTT AAATTATTTA GACACAGAAG AAACATTGAC 4560 

AGATGAGCGC GTTTCAAAAG TACAAGCGGA AATTGAAGCA GCATTAATTG AACAAGGTGC 4620 

TGTTATTAGA TAATGATTTA AACCCCATGT ATAAG6ATAT CTGAAGTAGA TTGATATCCC 4680 

TAACATGGGG TTTTATTTTT GGGTTCACCA ATTTGGTTCC AATGCATTTA AAAAGTCAAA 4740

GAGGAACAGC GGAATACAGA TGATGcTTCG CACAACTGCA TAAAAGCCTC TAATGATTAA 4800 

AAATCAAAGA GGCTTTAAAA TTTTTTGGGC TTTTTCACGA TTTTTAAAAT GCTTTTTTGA 4860 

AATGGTATCT AAACGTGAAA GACCGTATTT TTTTATAATT TTGGCGGCGA TTACATCGAC 4920 

TTTAGCACCX3 GCACCTTTAG GAATCGTCAT ATTAATATTT TTTGATATTT GATCCATATA 4980 

TGTAACAAAT GCGTATCX3AG AAATTATGCT TGCCACTGCA ATOQCTAATG ACTTCGATTC 5040 

TCCTTTTGTT TCAAATTTTG TTTTCTTTGG AAGTGGTATA TCTGATAATG CX3TAATGGCT 5100 

ATACACTTCG CGTTTTGCGA ACTGATCAAT GACGATATAG TCTAATTGAG AGGAATCAAT 5160

TTTTTCAAGT ACATTTTTGA TGGCTTCATT ATGAAGGGCA GCTTTCATTT TTACTTGAGT 5220 

CCAGCCTTTT GCTTGCTGAA TATTATATTT TTCATTGTGT AGTGTTAATA ATGAATGTGG 5280 

TATGAAAGTA ACCAATTGCT CAGCAAGTTC TACAATTTTG GTATCGGTTA ATTTTTTTGA 5340 

ATCATCTACA CCCAAAGTTT TTAAAATAGG GACATGCTCT TTGGTAACGA AAGCAGCACA 5400

CACAGTCAAC GGACCAAAGT AATCGCCACT TCCAGCCTCA TCACTACCAA TACAGTTAAA 5460 

TTGrTCATAC ATTAaAGTTg TcCAgAAAAG AATTAGCCAT ATTTnCCTTT 5510 
(2) INFORMATION FOR SEQ ID NO: 166:
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(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9623 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166: 

GnTTATACTT ATAAATTTTA CGGGGGTAAT ATAATACTtA TTTACCTGTA ATATATGATA 60 

ATTCTTCAGC GGCAGCTGCG TTGATAGTTC TATGAGAAAT GATACCTAAT CCTTTAACAT 120 

TGGATTCTGA AATAACGATA GAACCATCAC TGTTAACTTT TTCAACAAAT GCTACATGAC 180 

CGTAATGTTG ATCTGCACCA AATTGTCCAG CCTCAAATAC AACAGCAGCA TGACGTTTTG 240 

GTGTATGACT TACTTGATAA TCACGGTATT GAGCTCGATT ATTCCAATTA TGTGCATCAC 300 

CTAAATCACC TGAGATA6AT GTACCAAATT GTTTCATACG GTTATATACG TACCAAGTAC 360 
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ATGAATCATC ATAATCCTTG ATAGAACGTT CATATTTATC TAAATCTGGC ATGCGTTCAT 480

CGTCAAACTG AGTTAATTGA TAGTGTTTAA TAATACTGTT TAATTTCTTA GCATAGTTTG 540

GATCTGTAGC ATATGTTTTA GATAAGTGTG ATGTTGCATC TTTATAAGAA TCGGCTTCCG 600

ATTTCCATGT TGGTTTATAA ATTGTTCGAT TGCCATCAAT ACCATTTTTA ATAAGGTCAG 660

AGTAATCTTT TAGTGATTCT TTCGTGCTTG GATATTTTCG GAATCCAGCA TTAATACTAT 720

ACAATTGATT ACCATCAGCT TCTAATGTGT TAAAAGGAAC AGAATTCCCT TCaAAAGCAC 780

CTTTGATACC GAATAAATTA TGGTTTGGTG ACwTAGCTAA AGCACTACGA CCTGAGTCAG 840

ATTCTAAGAT TGCTTGGGCA ATCATGACAG ACGCATAAAT ATCX3TTATCT TGACCAATGC 900

GATGTGCATC TTTAGCAATT GATTTGACAA ATTGAOGTGT ATCTTTTGAG TCAACAACGT 960

TAAATTGTCC GCTATCATCA TTGTTAGATA TACTAGGATC TGTTTCGAAT AATGATGTTG 1020

CACGTGTATC CTTTTGATTA ACATCGTTAT TGAATGATTG AGCAGGTTTA GATTTATGTT 1080

TCAATTCATC TTGTGTTGGT AACTGT<5GAT TCTTTGTATT AGATTTTTCA TTTTTGTCTT 1140

TTTTAGATTG AGATGCATAA TCTTTTTGTG TTTTCTTTGC ATCTTCACTG TATTGATCCA 1200

AAATAGAGTC TAAAGCCGAA TCTGACATTG ATTGATTATC TTTCGATGAA GATTTTTGAT 1260

TTGCTTTATC GTCACTTGCT GGTTGACTAT TTGATTGATT AGGTTGTGTT GGCTTTGGCG 1320

AATTTGGTTG CTTATTAGAT GTACTTGGTT TT6TATTGTT TGATTTAGGT GCTTTTTGAT 1380

TGTCTGCTTT ATCTTGTTTA GATGATTGCG TATCAGTGTC ATTTTTGATG CTATTGTCAC 1440

TGTTTTTATT CGAATCATTT GTTGACTTTT CGCCATTACG AGGTTGTTCG TAATCAGAAA 1500

TATCCGAATT TAAATTGAAT AAGTTTTGGA TTAAAGTTGT TAATGAGTAA TTATCATOST 1560

ATTTATTTTT GGTTAGCAAT TQGTTTKTPCI TGGTTTGTGG TA/^TTCTTA TAAATAAAAT 1620

CAATSSATATT GTTAGAGTCT OAAGTGCTGT C6TCTATAGT TTTAAATTTT TTGTCGTTAT 1680

TGTCTTGGTT ACTTGTATTA TTTTTGTCTG CTTTATCAAT ATCTTTACTT GTAGTATCCT 1740

TAGAAGTTTC ATCGTCATTA GATTTTTTTG AATCATGAGA TGTTGTCTTA GCTGTAGTAT 1800

CTTTTTGAGG TGTATCAGCA TAAGCGgTAG GTGAAaCTAA AGTAGGTAAT ACGAGCGTAG 1860

TTGATAGCAA ATAAATTAAA ATTTTATTTT TAGGCATATT TCX3TATTCTC CCTTGAAAAA 1920

TATAATAATT AAGTGTGATA ATAAACTATG ATTTGTTATA ATTTATCGTA TGCTGAAAAT 1980

AGTTGATAGG TATCAATCGA CTAAATATCT TCCA6TAAAT TGATTATACT AATTCACAAC 2040

GCAAAAATAA ATTAATTTAC AAAAAATATA TAAAAAATAT GAATAATTCC TACATAGGAG 2100

TGTGACAATG AAGAACGCAT TTAAATTATT TAAAATGGAT CTGAAGAAAG TAGCTAAGAC 2160
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TAACTTATGG GCAATGTGGG ATCCATATGG CAACACGGGA CACATCAAGG TCGCAGTCGT 2280

TAATGAAGAT AAAGGCGACA CAATCAGAGG GAAAAAAGTT AATGTCGGTA ATACGATGGT 2340

TAATACACTC AAGAAAAATA AAAGTTTTGA TTGGCAGTTT GTAAGTAGAG AGAAAGCTGA 2400

TCATGAGATA AAAATGGGTA AATATTTTGC AGGTATTTAC ATCCCATCTA AGTTTACACA 2460

TGAAATTACA GGGACACTAC GTAAGCAGCC TCAAAAAGCA GATGTAGAAT TTAAGGTGAA 2520

TCAGAAGATT AACGCTGTTG CX5TCTAAGCT AACAGATACT GGTTCGTCAG TTGTCGTTGA 2580

AAAAGCGAAT GAACAATTTA ATAAAACAGT AACTCGAGCA TTATTAGAAG AAGCTAACAA 2640

AGCAGGTTTA ACTATTGAAG AAAATGTGCC GACAATTAAC AAGATAAAAA ATGCXKTTATA 2700

TTCAGCAGAT AAAGCTTTAC CTAAGATTAA TGACTTTGCX3 AATAAAATTG TATATTTGAA 2760

TAACCACCAA GCGGATTTAG ATAAATATGC CAATGATTTT AGAAAACTAG GAAATTATAA 2820

AGGTGATATT TTAGATGCTC AGAAAAAATT AAACGAaGTC AATGGTGCTA TTCCGCAACT 2880

TAATGAAAAG GCTAAGTTGA TATTAGCTTT AAATAATTAT ATGCCGAAAA TTGAAAAAGC 2940

GTTAAATTTT GCAGCTGATG ACGTGCCAGC GCAGTTCCCT AAAATTAATC AAGGACTTAA 3000

CATTGCGAGT CAAGGTATTG ATCAAGCTAA TGGACAGTTA AATGATGCCA AAGGCTTCGT 3060

CACACAAGTT AGAAGTAGAG TCGGTGATTA TCAAGATGCA ATTCGACGCG CGCAAGATTT 3120

AAATCGAAGA AACCAGCAAC AGATTCCTCA AAATAGCGCG GCGAACAACG AAACATCAAA 3180

TAGTGCACCT GCAGCTGGTA ATGGTGTAGC ATCAACGCCA CCAAGTGCT^C CAAGTGGCGA 3240

TACTGCACCA AATAATAATG TTACGCAAAA TACCGCACCA AATAGTAATA ATGCGCCTGT 3300

ATCGACTACA CCACAAAGTA CAAGCGGGAA AAAAGATGGT CAAAGTTTTQ TAGATATAAC 3360

AACAACACAA 6TCAGCACAG CTAACGAGAA CACACAAAAC ATTACAGATA AAGATGTTAA 3420

ATCASSXSGAA GCGGCATTAA CGGGCTCTTT ATTATCATTA TCAAATAATT TAGATACCCA 3480

AGCGAAAGCC GCACAAAAAG ATAGTCAGGC ATTACGTAAT ATTTCGTATG GGATTTTAGC 3540

ATCGGACAAG CCTTCTGATT TTAGAGAGTC TTTAGATAAT GTTAAGTCCG GTTTAGAATA 3600

CACAACGCAA TATAATCAAC AATTTATCGA TACATTAAAA GAGATTGAGA AGAATGAAAA 3660

TGTTGATTTA TCAAAAGAAA TTGATAAGGT AAAAGCAGCT AATAATCGAA TTAATGAATC 3720

ATTAAGGTTA GTTAATCAAT TAAGCAATGC ATTAAAGAAT GGTAGTTCAG GAACTGCTGA 3780

AGCTACTAAA TTACTAGATC AACTTTCAAA ACTA6ATTCA TCATTATCAT CATTTAGAGA 3840

TTATGTTAAA AAA6ATCTTA ACAGCTCTTT AGTATCAATA TCACAACXSTA TTATGGATGA 3900

ATTGAACAAA GGGCAAACTG CATTATCCAA TGTTCAGTCT AAATTAAATA CAATTGATCA 3960
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AACAGTATTA CCAAGTATTG AACAACAATA CATTAGTGCT GTTAAAAATG CTCAAGCAAA 4080

CTTCTCGAAA GTGAAAAGTG ATGTAGcTAA AGCTGCTAAC TTTGTGCGCA ATGACTTACC 4140

ACAGTTAGAA CAGCGATTAA CTAATGCGAC AGCAAGTGTG AATAAAAATT TACCAACGTT 4200

ATTAAATGGT TATGATCAAG CGGTAGGATT ACTAAATAAA AATCAGCCAC AAGCGAAAAA 4260

GGCTTTATCA GATTTAGCTG ATTTTTCTCA AAATAAATTG CCTGATGTTG AAAAAGATTT 4320

GAAAAAAGCG AATAAAATTT TCAAGAAATT AGACAAAGAT GATGCAGTCG ACAAATTAAT 4380

CGACACACTT AAGAATGATT TGAAAAAGCA AGCGGGTATT ATTGCAAATC CTATTAATAA 4440

GAAGACTQTT GATGTTTTCC CAGTTAAGGA TTATGGTTCA GGTATGACAC CATTCTATAC 4500

TGCACTGTCA GTATG06TAG QTGCACTCTT GATGGTAAGT TTATTAACGG TTGATAATAA 4560

ACATAAGAGT CTAGAGTCAG TCTTAACGAC AAGACAAGTG TTCTTAGGTA AGGCAGGATT 4620

CTTTATAATG CTTGGTATGT TGCAAGCACT CATTGTATCG GTTGGAGATT TGTTAATCCT 4680

AAAAGCAGGA GTTGAGTCAC CTGTATTATT TGTACTTATA ACGATTTTCT GTTCGATTAT 4740

TTTCAACTCA ATCGTATATA CGTGCGTATC ATTACTTGGT AACCCAGGTA AAGCCATTGC 4800

AATCGTATTG CTTGTATTAC AAATTGCAGG TGGTGGGGGA ACATTCCCAA TTCAAACTAC 4860

GCCACAATTT TTCCAAAACA TTTCGCCATA CTTACCATTT ACGTATGCAA TTGATTCATT 4920

ACGTGAAACA GTAGGCGGTA TTGTTCCGGA AATCCTAATT ACAAAATTAA TTATATTAAC 4980

GTTATTTGGT ATAGGATTCT TCGTTGTAGG TTTAATTTTA AAACCTGTAA CAGATCCATT 5040

GATGAAGCX3C GTATCTGAAA AAGTTGACCA AAGTAACGTT ACAGAATAAA AATTAAATCC 5100

ACACATTAGG GTTATAGCTC CTTAATGTGT GGATTTTTAT GTTTTTAGAC AGAAGAGATA 5160

GTAATTTCTG TCTTTTATGG GACGGTTGTT ATCATTGCTA TTATCCAGGA TGACTTACTA 5220

TAGCaCTAAT ATTACCGACA AAGTGAATAT CCTCGTCTTC CGTAGTTAAA ATAAAGCTAG 5280

AACCTTTTTG GATGTCATAG TGCTTATCGT TTACTGTTAA AGTACCAGTA CCATCGATAA 5340

TTGTAACTAA GCAATAAGCA TGTGGTTTAT TGAATTTTAA ATCTCCATGA ATATCCCATT 5400

TATATACTGC AAAATATTGA TTATCTACAA ATTGAGTTAC AGTGTGTGTG TCGATGTGAG 5460

TTGTTATAGG AGTAGTATTT GGTTCATGAT TGCCTAATTC AATCACATCT TTACTTTGCT 5520

CTAAGTGCAA ATCACGCAAT TGACCATTTT GATCTCGTCT ATCATAGTCA TAAATACGGT 5580

ATGTCGTATC GGAGGATTGT AAATTAAAAT ACCCGAACCA ATGGCATGGA 5640

CAGTGCCAGC AGGAACATAA TAAAAGTCAC CGGGCTTAAC AGGTATACGT TTGAAAAGAC 5700

TGTCAAATTC ATGATTATCA ATCATGTCTA TTAACGTCTG TTTATTATGT GCATGTACGC 5760
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GTTCGCCTTC GTGTTTTAAA GCGTAGTCAT CATCTGGGTG AACTTGAACA GATAATTTAT 5880

CATTGGCATC TAATACTTTA GTTAGCAGAG GGAAACTATC TCGTGAATCA TTATCGAATA 5940

ATTCACGATG TTGTGACCAA AGTTGATCTA GGGTCATATC CTTGTATGGA CCATTGATAA 6000

TTGTATTAGG ACCATTTGGA TGTGCAGAAA TTGCCCAGCA TTCACCAGTT GTTTCATTAG 6060

GGATATCATA GTTAAATGCT TTTAATGCAT GACCGCCCCA AATTCTGTCT TTAAAAACGG 6120

GTTGTAAAAA TAATGCCATA GTTAAAACTC CTCTATATTT TCATTAATAA GTTATAAATT 6180

TCTGTAGTAC TGTTTGCATT AATTAGTGAT TGGCGTGTCT CATCATTCAT TAACX5CTTTA 6240

GATAAGCGCT GAAGTATTTT TAAATGTGTA TCCTGACTGT TGTTTGGTAC GGCAATTAAG 6300

AATATCAATT GAGGTAGACT ACCATCTAGA CTGTCCCATT TAACACCATG ATTATTTTTC 6360

ATAACAGCTA CAATCGGTTG TTTTACAACA TCAGACTTTG CAT6TGGAAT GGCCACGTTC 6420

ATGCCAATAG CTGTCGTAGm tCcATTTCAC GTTCTAGTAT TGCATTTTTT AAATGCGATG 6480

TGTGCTCTAC ATAACGGCAA ATTTTAAGTT TATGAATCAA CATATCAATT GCTTCGTTTC 6540

GAGACATGTC GTGATCAGTA ATTATCATAG TTTGTTGATC AAAAACATGA GAAGGTTTAT 6600

TGAGATGTGA ATGTTTCGCG GTGTTATCTA CATTGTCAAC CTCTGTATCA TGTTGTGTAA 6660

TATCTGTATC ATGAAGTTGC GTGTGTTGCG CTGGTGCATC TACTGCTATA ACTGGTGTAT 6720

TGCGTTTTAA TAATAGTACA GTAGTCATTG TGACAAGACT ACCTACTATC ACTGCAAAGA 6780

TAAACCATAA TACATGATCA ATACCACCTA ATACAGCCAC GATTGGACCT CCATGTGCGA 6840

CTCTATCGCC GACACCACCA ATGGCTGCAA TGACTGATGC AATCATTGCA CCAATGATGT 6900

TTGCAGGTAT AATGCX5CAAT GGATCTTGGG CTGCGAAAGG AATAGCACCT TCAGTAATAC 6960

CAAATAGTCC CATAGTGAAG GAAGCCTTAC CCATTTCTCT TTCGGAATGA TTGAATTTAT 7020

ACTTTTGAAC AAACX5TTGCT AAACCTAAAC CGATTGGTGG TGTACATACA GCAACTGCGA 7080

CCATACCCAT AACGGCGTAA TTACCTTCAG CAATAAGTGC TGAGCCAAAT AAAAATGCTA 7140

CCTTGTTTAC TGGACCGCCC ATATCGAAGG CAATCATCGC ACCTATAATC ATCGCAAGTA 7200

TAATAATATT AGCACCTTGC ATACTTTTTA ACCAGGTTGT TAATGCCTCA AAAATATTAG 7260

AAATTGGTGC ACCGATTAAA AATATAAATA TCAATCCTAC AACGACCGAT GAAATAATGG 7320

GAATAATAAT GATAGGCATA ATTGGTGCCA TTGCTTTTGG AACTTTAATA TCTTTAATCC 7380

ACTTTGCGAT ATAACCTGCT AAGAAACCAG CAACAATACC ACCTAAAAAT CCTGCGCCTG 7440

CATCACTGCC ATAAAAACTA CCGTCAGCAG CGATAGCGCC GCCAATCATA CCAGGAACAA 7500

GACCGGGCTT GTCAGCGATA CTAACAGCX3A TATATCCAGC TA6TATTGGA ACCATAAATT 7560
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ATCCTTTTGA TGTCGTTtCA CCGCCTAGAG TCAGCGCGAT GGCGATAAGG AGTCCACCAA 7680 

CTACGATAAA AGGAACCATA AACGATACAC CGTTCATTAA ATGTTGATAC ACCATTTGAA 7740 

TACCATTTTT AGACTTACCG CGATCTTTCG AATGATAATT TGTTTCAGAT TGATAAATAG 7800 

GCGCATCTTG ATTAATGATA CGTTGAATTA GACCTCTCGG ATTATGAATC CCTTCGCGAA 7860 

CATTTTCATT AATC7ACCGT TTACCAACAA ATCGGGACAG ATCAACTTGT TTATCAGCTG 7920 

CAATTATGAC ACCGTCAGCT TCTTCXSATGT CTTGCGTAGT TAAAACATTT TCAGCACCAA 7980 

CACCGCCCTG TGTCTCTACT TTAATATCCA CACCCATTTC TTTTGCTACC TGCTCAAGCT 8040

TTTCTTGAGC CATATATGTA TGT6CAATGC CATTTGGGCA TGAGGTAATA GCTACAATTT 8100

TCATAAAATC ATCTCCTTTT CTATATTGTA AGCGTATTCT CGATACTAAA AAAAAGAATA 8160 

ATTACCGTTA CTAGTGGCAA TTATTCTTGT AAGTATTCAA ATAACTGTTG CTTTAAACTA 8220 

TGATCATCTA AACTACATAA ATGGTTCACT GAATCATCAT CCAAGTTAGC AATTAATT6C 8280

ATCATTTGTT TTGTAAAAGC TTTGTCTTTA TGCGAAATCG CTAAGAAAAA GACAAGTTTG 8340 

ACATCGTGTT GTCGCCAAGG AAAAACATCT TTTGTGCGAA AAATAAGCAC ATGTGATTGT 8400

AAAACTTTTT CAGGATCTCC ATGAGGAATC GCCATAAAAT TACCTATGTA TGTAGAAGAT 8460 

GATTTCTCAC GCTCTAAAGC TGATTCGATA TATCCTTCTA CAATCGCATG ATGTGCTTGT 8520 

AATATTTTTT GAGCTTCTTC AAAAATTTGC ACAGTATGCC GTGATTTTTG TTCAGTATTT 8580 

ACGACAAGGA AATTGACAGT GTCCATATGA TGATGTGCTT GAACCGGATT TTGCTTTTGC 8640 

TTCACAACGT GTCTGATTTT GTGACGATCA TCTTCAGAAA ATAATGGTGC AACCTTGATA 8700 

GTCGTCAGGT GCTTAOGAAG TATGTTTAGC GTTTGTTTAG GAATATCATG GGTCGTTATT 8760 

AATAAATCTA CATTGTCAAA GTGATAGTGT GTTATATTTT CTAGTTTAAT CGTATTTATC 8820 

ACTG&CAACT CTTCGGATAA GTTATTTATT TTAGTTTCTA AAAAATTCGA CACACCTAGA 8880 

CCATAATAAC AAGCAATGAC TACATTTAAT TGTGTTTTGG TACGACGCTC GATGGCAGCT 8940

TGAAAATGAA TTGTTAAAAA TGCAATTTCA TCTTOGCTCA TCTCTATATC AGTATCAATT 9000 

GCTAATTTAT CAATCGCTTC AAAAAGTGTG TTAAACACAA AGGGATAGAG TTTTTTAATC 9060 

TCTATAACTA AAGGATTGTT TAAATAAATG TTTTGAGTGA TACGTAAATA TGCTTTACTA 9120 

AAATGATTAT ATAAATTTTG TTGTAAAATC GAATCTTCAT TGAAAGGTAC ATGAATACGT 9180 

TGCTGCATCA ATTCGATTAA GCGATCAATA TAACTTTGTA TAAATATACG TTCTATGCCA 9240 

ATATCGAGTT TATTAAAATG ATAAGCAATA AAGAATGAAA ACATATTGAT TACTTTTTCG 9300 

TTCAAGTCAT AACCTAATCT TTCGTTGATT TGCTTAATGC AAGATTGAGA TATCAATTTT 9360 
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AGATGAATTA AAAGCTGTTG TATTTGAATA TCAGTTGTTT CAATACTATG TTGTTGAAGT 9480 
GTCTCTTGTA TAATATGCGA AATCATCCTT TGGTGTGAAT CAGGTAATTC aTTTAAAATT 9540 

AGGTCTTCAA CATGTACATG CCCTGATGAT AATTGATTTA AATGGATGAT GGCATTAGTG 9600 
ATATCATTAT CTGTTCCATC GAC 9623 
(2) INFORMATION FOR SEQ ID NO: 167: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1021 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167: 



ACCGTGGAAA CACGTCTAGT CAATCAGAAA GCGATAAAAA TGTGACTAAA TCATCTCAAG 60

AGGAAAATCA AGCAAAAGAA GAATTACAAA GCGTTTTAAA CAAAATTAAC AAACAATCAA 120

GTAAGAATAA TTAAAAAATT TTGATATTGT CTATGTTTAT AGTTCACAAG CCATTCAACG 180

o Inl 1x1111 AATAGTAATT TGTCAGGAGG TGCCTATCTA 240

TGGAAGAACA TTACTACGTA AGTATTGATA TTGGATCATC AAGCGTAAAA ACAATAGTAG 300

GCGAGAAATT TCACAATGGT ATAAATGTGA TAGGTACAGG ACAAACCTAC ACGAGCGGTA 360

TAAAAAATGG TTTAATTGAT GATTTTGATA TTGCGCGACA AGCAATCAAA GACACAATTA 420

AAAAGGCATC AATCGCTTCG GGTGTTGATA TTAAAGAAGT TTTCCTGAAA TTACCTATCA 480

TTGGAACGGA AGTTTATGAT GAATCAAATG AAATCGACTT TTATGAOGAT ACAGAAATCA 540

ACGGTTCACA TATCGAAAAA GTATTAGAAG GTATTAGAGA AAAAAATGAT GTGCAAGAAA 600

cag;&gtaat TAATGTGTTC CCGATTC6TT TTATAGTCGA TAAAGAAAAT GAGGTTTCAG 660

ACCCTAAAGA ATTAATTGCC AGACATTCAT TAAAGGTTGA AGCAGGCGTA ATTGCTATTC 720

AAAAATCGAT TTTAATTAAT ATGATTAAAT GCGTAGAAGC ATGTGGTGTT GATGTATTAG 780

ATGTTTACTC TGATGCATAT AACTATGGTT CAATCCTAAC AGCTACTGAA AAAGAGTTAG 840

GTGCATGTGT CATTGATATT GGTGAAGACG TTACGCAAGT TGCTTTTTAT GAACGCGGTG 900

AATTAGTAGA TGCTGATTCT ATCGAAATGG CAGGGCGTGA TATTACaGAC GATaTTGCAC 960

aAGGrTTaAA CACTTCTnAT GAAACTGCTG nAAAAAGTTA AACACCAATn TGGTCATGCA 1020

T 1021



(2) INFORMATION FOR SEQ ID NO: 168: 
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(A) LENGTH: 7963 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168: 

TAATCTATTA TAAAAACTGT CCATACCCTT TGATTACCTT CTCTTCAGGT ACAGGCCACA 60 

CTTGAGGCCA TAAGCCATAT GCTTGCTGTG AATAAAATTG TGCCATTTGT AACAATATAA 120 

TATATACAAA TAAACACCCA ATAATTGCTG TCACTAATGG ATATGATAAC CAAACCATTA 180 

ATAAAACTGC AATAATTACT AACCTAAAGA TAATATTAAA TGCGTCTCTC CCTCTTATAA 240

AGCTTCTAAT AAATAAGAAT AAATACATCG CATTAGAGTT AAATTTACTA CCCTTTGGAA 300 

CTGGTAAAAG TATATCTAGA TAACTTCTTC TGACTGCAGA TTCTTTCAAA TGTTTTACAT 360 

CGGTGAACAT ATTAACAAAT TTATAATAAT TCATATGATG TCGATGTTCG ATTGCAATCA 420 

TTTTCTCCCA AGGATACAAA AAGCCTGGTT TATATTTTTT AACTAAAAAT TCTATTAACA 480 

CAGGCAAAGC AACCATCACA AATGCGATGT ACCATTTTGG AGCTAATAGT AAGTAATATG 540 

TTAGAGCAAA GGTGATGAAT GATATTAAAT TAACTTGCCA TGTTTTAAGT CCCGATTGAT 600 

ACCATTGCCA TCTTAAGCGT AAACCAACAT ATGGAAAAAT TAATGCACTG ACTCCAAAAC 660 

AAATATAAAA TGCCACATTA TGTTGATTAA TATTGTAAAA CAACGGGAAC ATTACAATAA 720 

CAATAATGAG TTGGATTAAT ATGCGCGCAA AGTAACTATA TAAAATCGCA TGACGCATAA 780 

ATTGAGACAT GTGTTTTTCA AATGGTAATA AAAAGATTTT ATCCgCTTCT TTTAACAGTG 840 

GTCsCmTTGG AAAAATAGrT GTCAACGCAA CAATCACTGC TGCTATTaAT GAAAAATTGa 900

TATTCGTTGG AATATGTTTT AACCATTcAC CATATCCArA AATAAATGCA CCCAGCAAAA 960 

TAAGTAAAAA GACCATGAAA TGACCATTAA ATATAAACTT ATTATAATAA TTTTtCTCTT 1020 

TACGAAGGGC ATGTAATCTT TTATTAAATA ATGTGGTAGC TTGGTTACGC ATGTACATCT 1080 

CCACCTTGCG TCACATGAAT ATATATATCG TCTAATGTTT GATTATGTAA GCCAGTTTGT 1140 

TGTCTCAATG CTTCTAAATC TCCAAATGCA ACGACTTCAC CTTCGTCTAG TATGaTAAAA 1200 

CGATCACAGT AACGTTCAGC TGTTGCTAAA ATATGTGTAC TCATTAGAAC GGTTCTACCT 1260 

TCGTTTTTCT TTTCAACCAT TAAATCTAAC ATGGATTGAA TTCCTAATGG ATCTAGGCCA 1320 

AGGAATGGTT CGTCTATAAT ATACAATTC6 GGATTAACGA TAAACGCACA AATAATCATG 1380 

ACTTTTTGTT TCATCCCCTT AGAAAAATGA CTCGGAAAAA CTTTCAACTC ATTTTCTAAA 1440 

CGGAATGTCT TTAATAATGG CATTGCTCGA TTCATCGTTT CATCACGATC AATATCATAT 1500 
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TCCGGAATAT AAGATAACTT TCTTCTATAA GCCTCTATGT CATCATTAAT GTTGATATCT 1620 

GAAATTGATA GAGATCCTTC CATAGGTGTA AGCAATCCTA GCATATGTTT AATCGTTGTA 1680

CTCTTACCAG CGCCATTAAG GCCAATAAGT CCAACAATTT CGCCTTTGTT TAATTCAAAA 1740 

TTTATATCTT TAATTACAGG GCGTTTTCCA TATCCACCTG TAAGCTGTTC TACTTTAACT 1800 

GTCATAAGGC ACCTCCATGA CTTATATTGT ACCAAAAATT ATAAAATGCT CATATTAAAT 1860

ACACATGTCC TAATATCGAA TTTTTAGCGA CAATGTTATA ATGAATGGTA ATACTAGTTG 1920 

AAAAGGAGTG TAGTCATCAT GTCAGAAACA ATTTTCGGCA AAATTTTAAC TGGAGAAATT 1980 

CCTAGCTTTA AAGTATATGA AGACGATTAT GTCTATGCCT TTTTAGATAT ATCACAAGTT 2040 

ACTAAAGGAC ATACGTTATT AATTCCTAAA AAAGCTTCTG CTAATATCTT TGAAACTGAT 2100 

GAAGAAACAA TGAAACATAT CGGTGCAGCA TTACCTAAAG TAGCAAATGC TATTAAGCGT 2160 

GCATTTAATC CTGATGGTTT AAACATTATT CAAAATAATG GTGAGTTTGC AGATCAATCT 2220

GTATTTCATA TTCATTTCCA CTTAATTCCT CGATACXyUU^ ATGATATTGA TGGATTTG6T 2280 

TATAAGTGGG AAACACATGA AGACATTTTA GATAACGATG CAAAACAACA AATTCCTGAA 2340 

CAAATTCAAG CACAATTTTA AATGTATGCT TAATCTAAGC TCGAACGGGT ATAATATGAT 2400

TAATATTATA ACAATTGCGT TTGAAGTGAT AACATCAAGG TTAGCAATTT TAAACAAAAT 2460 

GAGTTATCAA GATAACAGAT GTTAAAAGTG AGGAGAATAT AAATGAAAGC ATCACGCATT 2520 

CTATTCGGTA TCGGTGTTGG CGTAGCAGCT GGTTTTGTAG TTGCACTTCA AGGACGTGAC 2580 

GACAAAAGTG TCAAGAACAA CACGATCGAT CGTACTGCCC CTACTGGTTC AAAATCAGAA 2640

CTACAACGTG AATTTGAAAC GATTAAACAA AGTTTTAATG ACATTTTAAA CTATGGTGTT 2700 

CAAATTAAAA ACGAAAGTGC GGAATTTGGT AGTTCAATTG GTGGTGAAAT TAAGTCATTA 2760 

CTTC^AAACT TCAAATCTGA CATTAATCCT AATATTGAAC OTTTACAGTC ACACATCGAA 2820 

AATTTACAAA ATCGTGGCGA GGATATTGGA AACGAAATTT CTAAGTAGCA GGTTACGTTC 2880 

TOGATCACAA CTATTTTTAT TAGTAACAGC ATATTTATTT TTTAAAATTA AATGCCAAAT 2940 

AAACGAGATG ACATTAGAAA TTAGATATTT CTTGTCATCT CTTTTTTAAA ACTCAAATGA 3000 

ACTTATGTTT ACAAATTATA GGAAGACATT GTTTGTAGTG ATTTTCGCTT AAATCATATT 3060

TATGAATTGA TTGAAAACAT TGCTTAGGAT TCATTGTGTT ATCCtTGCAC TTTGATTACG 3120 

CTTTACTTAA ATCATTATCG ACAAACAACA TACTTATATT TTCATTGAGC CGAACCTTAT 3180 

ATACACATTA CATATACCTT ACTTGCACAA ATTATTAATC TGGTGTTTAT TATAATTACA 3240

TATCACTATA TTTTTAGCAT TTGTATAACT TAGTTGGTCA AAAGATGCTT TTGCATATGC 3300 



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TTTCATAAGT GATGCTTTAT TAGCAAGAAT ATGTGTTCGC AGAAATTTGT TCTGCATTCT 3420 

ACTTCTACGC TAGTCAATCA GACAATTTTA CCAATCCCCA CTTTCGCGTT TCAAATCAAA 3480 

CAATACGTCG CTCCTTTCTT CTTATATAAC AATTCTTCTA ACATGATATG TTACTATTGA 3540

ATTACTGAAC CTGAGTTAGT TATAATCTAA CTTATATTGA AAAGAGATGA GGCGTAAGAT 3600 

ATGTTTTTAT GTAAAAGACA AATTGATATC AATGCACGAT TTGGTTTGCC TAGAATTGCA 3660 

TTTATGAGTG CAGTTGCAAC CATCATTATG TTTTTAGTTA GTTATGAAGT AATGTATTTT 3720 

TTATCTAATA CGCCATTATC AGATAGACAT TTTCTCATCT TTTTATTACT TGTATTTATG 3780 

ACGTATCCAT TACATAAAAG TATACATTTA TTATTTTTCT TACCATATAG AAAATCGITT 3840 

AAAGTTCATA AGTTAACTAA AAGAAAATGG CTTATATTCT ATAATACCTA CGTCAATCAA 3900 

CCTGTACACA AATTTTATTT TTGCATTAAC TTAATATTGC CGTTAATTAT CTTATCTGCA 3960 

ATGTTCGTTT ATCTAACAAT TTCATTCCCG CAATATGGAC ATTATTTTAT GTTCTTATTG 4020 

GCATTGAATT TCGGTATTTC CATTACAGAT TTATTATATT TAAAAATAAT TATATTTTCT 4080 

AATTATGGAC AATATATAGA AGAACATAGT ACAGGTATTA ATATTTTGAA AAAAATTAAA 4140 

AATCCATATC ATTTATAACA AAATAATTAT AGCAAGGTGT TATTATTTGT TTTTAGGCTA 4200

TGTAATAgcT tACAATCAAA TGTATATAGA CCTTGTTTTT TTATTTTCAT CAATTTCTAC 4260 

CCCTAAACCT AATGCTCTAG TCTGATGTCA TGGGTTATTG ATTGGTGATA ATATAAAACT 4320 

ATGTTATATT CACGATGATT AACTTACAAA GGAGTTTCAA CTATGAAGAT GATAAACAAA 4380

TTAATCGTTC CGGTAACAGC TAGTGCTTTA TTATTAGGCG CTTGTGGCgC TAGTGCCACA 4440 

GACTCTAAAG AAAATACATT AATTTCTTCT AAAGCTGGAG ACGTAACAGT TGCAGATACA 4500 

ATGAAAAAAA TCGGTAAAGA TCAAATTGCA AATGCATCAT TTACTGAAAT GTTAAATAAA 4560

ATTCTAGCTG ATAAATATAA AAATAAAGTT AATGATAAGA AGATTGACX3A ACAAATTGAA 4620 

AAAATGCAAA AGCAATACGG CGGTAAAGAT AAATTTGAAA AGGCCCTTCA ACAGCAAGGT 4680 

TTAACAGCCG ATAAATATAA AGAAAATTTA CGTACTGCTG CTTATCATAA AGAATTACTA 4740 

TCAGATAAAA TTAAAATCTC TGATTCTGAA ATTAAAGAAG ACAGCArGAA AGCTTCACAC 4800 

ATTTTAATTA AAGTTAAATC TAAGAAAAGC GACmAAGAAG GCTTAGATGA TAAAGAAGCG 4860 

AAACAAAAAG CTGAA6AAAT TCAAAAAGAA GTTTCAAAAG ATCCAAGTAA ATTTGGTGAA 4920 

ATCGCTAAAA AAGAATCAAT GGATACTGGT TCAGCTAAAA AAGATGGCGA ATTAGGTTAT 4980 

GTTCTTAAAG GACAAACTGA TAAAGATTTT GAAAAAGCAC TATTTAAGCT TAAAGATGGT 5040

GAAGTATCAG AGGTTGTTAA ATCAAGCTTT GGATATCATA TTATTAAAGC TGATAAACCA 5100 

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AAAAATCCAA AATTATTGAC TGATGCATAC AAAGATCTAT TAAAAGAATA CGATGTTGAC 5220

TTTAAAGATC GTGATATTAA ATCAGTTGTC GAAGATAAAA TCTTAAACCC TGAAAAACTT 5280

AAACAAGGTG GCGCACAAGG CGGACAATCC GGCATGAGCC AATAACACAA AACCGAGCGA 5340

CCGTGGTTCA AAAATCATAC CACGGCCGCT CGGTTTTTTC GCATTAAAAA TCGGACAGAT 5400

GAGCTCATGT TTCAGTATAC TCATCTGTCC GATATCTTTT AATTCTTAAT CGAGTGATTC 5460

AGGATTGTAG AATCTACGAT TTTCAAGACC AAATATTTTA TCTGTAAACT GACCCTTGTC 5520

AGTTTTTTTA TATGCCTTTT CAAACATATT CATTCTAGCA TCGATATTAT CGATATAGCA 5580

TAAAATTTCT GCTTCTTrTA AGTATGGCAG TTTTGGAGAA CCATACTCTA ACTTACCATG 5640

ATGAGATAAA ATCATATGTC TTAACAACAT GATTTCTTCT CCTTCAATGT TCAATTCACG 5700

AGCTGCTTCA ACTACTTCAT CACTCGCAAT CGAGATGTGT CCTAATAAGT TACCTTCGAC 5760

TGTATACGAC GTC6CAACAG GACCACTCAA TTCTCTAACT TTACCAATAT CATGCAAAAT 5820

AATACCACTA TATAACAAAC TTTTGTTTAA CAATGGATAA ATGTCaCAAA TTGATTTTGC 5880

AATACGTAAC ATCGTTAATA CATGATAGCT TAAGCCACTC GCAAAGTTAT GaTGATGAGA 5940

ACTAGCAGCT GGATATGTGT AAAATCGTTC TTGATATTTT TTCAATAAAT GACGTGTGAT 6000

ACGTTGTAAA TTAGCATTTT CAATATCTAG CAAATAATGA GAAATCTCTT CTTGTATTTC 6060

TGCCGGTGAT AAAGGTGCAC CATCTACAAA TTGTTCTGTT TTTAATTGAT CTTCAGTTGT 6120

CGCTAGTCTA ATTTGGTTGA CTTTCATCTG TTTATTTCCG CGATAGTTTA TGATGTCACC 6180

TTTAACATGT ACAATTTCTT CAGGCTTGAT TGTTGCCATA TCATTTTTTG TAGCCGTCCA 6240

AAATTTCGCT TCAATTTCAC CACTTTTATC TTGCAAATGT AATGTCATAT AATCTTTACC 6300

TTGTGCTGTT ACACCCTGTG TAGCTTTATG CACTAAGAAA AAGTGATCAA CTGAA7CTCC 6360

GGGWTTAGA TTCTCTATAT TTCTCATCGT TTCCCGCCTT CCTCTATTTT GTTTAATGTA 6420

ATCACTTCTT TTGATGGAAC AATATTATCT TTTACACATG TAAAGTATAG TACTTGATAG 6480

TGTTCTGATA ATGATCGTAA ATAATTCAAC ATTTTTTCAG TACGTTTTTT ATCAAAATGA 6540

ACAAATGCAT CATCJ^ACAAT TAATGGGAAC GGATAATATG GTCTTAGTAC CTTAATTAAA 6600

CTGATACGTA AAGCTACATA AAGTAATTCT TTTGTAGATT GACTTAGTTC AACAGGATCA 6660

TATAATTGAC CATTAACATG TTTAACCGTA ATTGAATCTT CATTATAGTT AATCATCGTA 6720

TATCTGCCAT CTGTTAAATG CTTCAATATT TCTACCGCTT CATTAATAAC TTGAGGCAAA 6780

CGTTTATCTT TAATTTGTTT AATGTGTTCA TCAACTAAAC TTTGTAAATA ACTTAAACTT 6840

6CCCAATCTT TTGCGATATC ATTAAGTTGA TTTTTAAGAC TGTGATATTC ATGTCTTAAA 6900
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10 



GCTTGCATTT CAAGATATTG CTCATTATAT TCGTCAACTT GAGTAGCCAA TAAATGATCT 7020

TCTTCTTCAA GTTGTGCAGT TGTTTTTTCA CTTAAACTAG AACTTAATTC ATAAGAATAG 7080

TTTTGGTTCT CAAGATATTT AGTTAAATCA TTAAAACGAC TCAAATTACT AGTATAAGTT 7140

TGGTAATCTT CATGATGTTG GTAAAAATCT TCTTCAGTAC CAACATTGAT AAAATCGAAT 7200

AGTGCTGTAA TTTCTTTATT ATTTTCTTCT AATTGAGCAT TTAAATGATT TAATTCATTT 7260

GTAACAAGTT TGGTATTTTC AGCATTAATA CGCCATTTTT CATTCGTGTC TTCAGCTGAT 7320

TTCAACCATT GTtGCACATC GTGGAATAAA GATAATTTGT TGAAATAAAC AAATTGTGAT 7380

TTTGTAACAG CTTCAGCATG ATTGTAGAAT GTATCTAATT CTTGAACCAA TTGCTGGCGT 7440

TGTTGATTTA AATCACTGAT ATGTTGATCT AATGCTTTAA TATTCGCCAT TGTAGAAATA 7500

CTATCAACAA TTAAATCATT TGAAATTTTA GATGATAAGT ATAATTCATC CTTAACGTTC 7560

TCy\ACTGTCG ATTGTAATTC ATCATGACGC CCTTTOGCAT CATTTAAACG ACCTTCAATA 7620

TACTGACGTT TCTCTTCTAA AATATCTTTA TTTTTCAAAG CTTGTTGCCA GTGATCACGA 7680

ATGCGATATT GCTCATCAAG ATCAATUITCT AAGTCATAAT TTTCATCTAA AATGGCTAGT 7740

TGTGCTTTAA TTTCTTCGAT TTCATCTGTG ATGGCCTCGC TATAATCTAC TTCTTTTGAT 7800

TTAGACATGA TGATACCGAT AACAAATACT AAAGTTAATA CTGCGAAAAT AA7ACCAAAC 7860

AACATGTTGT TTGAAATAAA TGAGAAGGCA GTTAAACCAA TACCTACTAA TGTTAAAAGr 7920

ATAAACGTTG TTCGkAACAA TTTTTGACGT TTTTGttTTT CTT 7963



(2) INFORMATION FOR SEQ ID NO: 169: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 3958 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169: 
ATATTGTCTT TACAATAGTT TGCTATGGAG GTAATTAACC AATAGGAGGA ATTTATAATG 60 
GCAGTAATTT CAATGAAACA ATTACTAGAA GCGGGTGTTC mCttCGGTCA CCAAACAOGT 120 

CGTTGGAACC CAAAAATGAA AAAATATATC TTCACTGAGA GAAATGGTAT TTATATCATC 180 
GACTTACAAA AAACAGTGAA AAAAGTAGAC GAGGCATACA ACTTCTTGAA ACAAGTTTCA 240 
GAAGaTGGTG GACAAGTCTT ATTCGTAGGA nCTAAAAAAC AAGCACAAGA ATCAGTTAAA 300
TCTGAAGCAG AACGTGCTGG TCAATTCTAC ATTAACCAAA GATGGTTAGG TGGATTATTA 360 
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GAAGATGGTT TATTCGAAGT ATTACCTAAA AAAGAAGTAG TAGAACTTAA AAAAGAATAC 480 

GACCGTTTAA TCAAATTCTT AGGCGGAATT CGTQATATGA AATCAATGCC TCAAGCATTA 540 

TTCGTAGTTG ACCCACGTAA AGAGCGTAAT GCAATTGCTG AAGCTCGTAA ATTAAATATT 600 

CCTATCGTAG GTATCGTTGA CACTAACTGT GATCCTGACG AAATTGACTA CGTTATCCCA 660 

GCAAACX3ACG ATGCTATCCG TGCGGTTAAA TTATTAACTG CTAAAATGGC AGATGCAATC 720 

TTAGAAGGTC AACAAGGCGT TTCTAATGAA GAAGTAGCTG CAGAACAAAA CATCGATTTA 780 

GATGAAAAAG AAAAATCAGA AGAAACAGAA GCAACTGAAG AATAATCAAC TGTTGAATCT 840 

GACTTAGATA TAGTTTAAAT GGGTGATAAG ATATTAATGC TTATCACCTT TTTTAAAAAG 900 

AAAATCGAGG CAAATTACAA ATATTCAATT AGAGTATTGG CAATCTTGCC TATAATAATG 960 

CTAAAATCAT AATATATAAn ATGATAACTT ATTGGAGGAA TAATGAATGG CAACTATTTC 1020

AGCAAAACTT GTTAAAGAAT TACGTGAAAA AACTGGCGCG GGTATGATGG ATT6TAAAAA 1080

AGOGCTAACT GAAACTGATG GTGACATCGA TAAAGCGATT GACTACCTAC GTGAAAAAGG 1140 

TATTGCTAAA GCAGCTAATU^ AAGCAGACCX5 TATTGCGGCT GAAGGTTTAG TACATGTAGA 1200 

AACTAAAGGT AACGACGCAt TATCGTTGAA ATCAACTCTG AAACAGACTT TGTTGCTCGT 1260

AACGAAGGTT TCCAA6AGTT AGTTAAAGAA ATCGCTAATC AAGTATTAGA TACAAAAGCT 1320 

GAAACTGTTG AAGCTTTAAT GGAAACAACT TTACCAAATG GTAAATCAGT TGATGAAAGA 1380

ATTAAAGAAG CAATTTCAAC AATCGGTGAA AAATTAAGTG TTCGTCGTTT TGCTATCAGA 1440 

ACTAAAACTG ATAACGATGC TTTCGGCGCT TACTTACACA TGGGTGGACG CATTGGTGTA 1500 

TTAACAGTTG TTGAAGGTTC AACTGACGAA GAAGCAGCAA GAGACGTTGC TATGCATATC 1560 

GCTGCAATCA ACCCTAAATA TGTTTCTTCT GAACAAGTTA GCGAAGAAGA AATCAACCAC 1620 

GAAA^^GAAG TTTTAAAACA ACAAGCATTA AAT6AAGGTA AACCAGAAAA CATCGTTGAA 1680 

AAAATGGTGG AAGGACGTTT ACGTAAATAC TTACAAGAAA TTTGTGCTGT AGATCAAGmT 1740 

TCGTTAAAAA CCCTGATGTA AGAGTTGAAG CTTTCTTAAA AACAAAAGGT GGAAAACTTG 1800 

TTGACTTCGT ACGCTATGAA GTAGGCGAAG GTATGGAAAA ACGCGAAGAA AACTTTGCGG 1860 

ATGAAGTTAA AGGACAAATG AAATAATCTG TCATAAAGTA AAACAAGGAA GAA6ACACCT 1920

TTAATGTTGC TTTATTAAAA TGTAAATCAT TCTAATAAAA CGACAACTGT GTCTTCTTTA 1980 

CTTGTATATG TTACATATAT TCACGATAGA GAGGATAAGA AAATGGCTCA AATTTCTAAA 2040 

TATAAAC6TG TAGTTTTGAA ACTAAGTGGT GAAGCGTTAG CTGGAGAAAA AGGATTTGGC 2100

ATAAATCCAG TAATTATTAA AAGTGTTGCT GAGCAAGTGG CTGAAGTTGC TAAAATGGAC 2160 
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TTAGGTATGG ACCGTGGAAC TGCTGATTAC ATGGGTATGC TTGCAACTGT AATGAATGCC 2280 

TTAGCATTAC AAGATAGTTT AGAACAATTG GATTGTGATA CACGAGTATT AACATCTATT 2340 

GAAATGAAGC AAGTGGCTGA ACCTTATATT CGTCGTCGTG CAATTAGACA CTTAGAAAAG 2400 

AAACGCGTAG TTATTTTTGC TGCAGGTATT GGAAACCCAT ACTTCTCTAC AGATACTACA 2460 

GCGGCATTAC GTGCTGCAGA AGTTGAAGCA GATGTTATTT TAATGGGCAA AAATAATGTA 2520 

GATGGTGTAT ATTCTGCAGA TCCTAAAGTA AACAAAGATG CGGTAAAATA TGAACATTTA 2580 

ACGCATATTC AAATGCTTCA AGAAGGTTTA CAAGTAATGG ATTCAACAGC ATCCTCATTC 2640 

TGTATGGATA ATAACATTCC GTTAACTGTT TTCTCTATTA TGGAAGAAGG AAATATTAAA 2700 

CGTGCT6TTA TGGGTGAAAA GATAGGTACG TTAATTACAA AATAAATTTA GAGGTGTAAA 2760 

ATAATGAGTG ACATTATTAA TGAAACTAAA TCAAGAATGC AAAAATCAAT CGAAAGCTTA 2820 

TCACGTGAAT TAGCTAACAT CAGTGCAGGA AGAGCTAATT CAAATTTATT AAACGGCGTA 2880

ACAGTTGATT ACTATGGTGC ACCAACACCT GTACAACAAT TAGCAAGCAT CAATGTTCCA 2940 

GAAGCACGTT TACTTGTTAT TTCTCCATAC GACAAAACTT CTGTAGCTGA CATCGAAAAA 3000 

GCGATAATAG CAGCTAACTT AGGTGTTAAC CCAACAAGTG ATGGTGAAGT GATACGTATT 3060

GCTGTACCTG CCTTAACAGA AGAACGTAGA AAAGAGCGCG TTAAAGATGT TAAGAAAATT 3120 

GGTGAAGAAG CTAAAGTATC TGTTCGAAAT ATTCGTCGTG ATATGAATGA TCAGTTGAAA 3180 

AAAGATGAAA AAAATGGCGA CATTACTGAA GATGAGTTGA GAAGTGGCAC TGAAGATGTT 3240

CAGAAAGCAA CAGACAATTC AATAAAAGAA ATTGATCAAA TGATTGCTGA TAAAGAAAAA 3300 

GATATTATGT CAGTATAAAA CTAATATACA ATGACATATT AAAATGCCAG TATTAAACGA 3360 

TAATGTAACA TTTAAAATGG GCATGTTTAA TTAAATCAAA GATGCATGTG ATAATTTAAA 3420 

TTCAQAATGA GCATAAAAAT GGTGTTTAAA CAAGTTAATT AAACATATAC TTTATAAATA 3480 

ATAGGCATTA GGTATATTGC TATAATAAAG TTATGTAATT TTTAACCTCA GTATGTATGT 3540 

CACATTTCTG GTGTAAACTG TACCGAGTCA GACTTTGGTA CAGTTTTTTT ATTTGCTTAT 3600 

TCAATGCATT AAATGAGTAT GATAAAATGA TAATGATTGT TTAGTAACTT ATACTATAT6 3660 

ACAGAGATGA TCAGGCTCGG AGGAAAGACC ATGTTTAAAA AGCTAATAAA TAAAAAGAAC 3720 

ACTATAAATA ATTATAATGA AGAATTAGAC TCGTCTAATA TACCTGAACA TATCGCTATT 3780 

ATTATGGATG GTAATGGGCG ATGGGCTAAG AAGCGAAAAA TGCCTAGAAT TAAAGGTCAT 3840 

TACGAAGtAT GCAAACAATA AAAAAAATTA CTAGGGTAGC TAGTGATATT GGTGTTAAGT 3900

ACTTAACTTT ATACGCCTTT TCCACTGAAA ATTGGTCAAG ACCTGAAAGT GAAGTAAA 3958 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5333 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170: 

ATTAAAACAA CTTAATATAC CTATTTATGG TGGTCCTTTA GCATTAGGTT TAATCCGTAA 60 

TAAACTTGAA GAACATCATT TATTACGTAC TGCTAAACTA AATGAAATCA ATGAGGACAG 120 

TGTGATTAAA TCTAAGCACT TTACGATTTC TTTCTACTTA ACTACACATA GTATTCCTGA 180 

AACTTATGGC GTCATCGTAG ATACACCTGA AGGAAAAGTA GTTCATACCG GTGACTTTAA 240 

ATTTGATTTT ACACCTGTAG GCAAACCAGC AAACATTGCT AAAATGGCTC AATTAGGCGA 300 

AGAAGGCGTT CTATGTTTAC TTTCAGACTC AACAAATTCA CTTGTGCCTG ATTTTACTTT 360

AAGCGAACGT GAAGTTGGTC AAAACGTAGA TAAGATCTTC CGTAATTGTA AAGGTCGTAT 420 

TATATTTGCT ACCTTCGCTT CTAATATTTA CCGAGTTCAA CAAGCAGTTG AAGCTGCTAT 480 

CAAAAATAAC CGTAAAATTG TTACGTTCGG TCGTTCGATG GAAAACAATA TTAAAATAGG 540

TATGGAACTT GGTTATATTA AAGCACCACC TGAAACATTT ATTGAACCTA ATAAAATTAA 600 

TACCGTACCG AAGCATGAGT TATTGATACT ATGTACTGGT TCACAAGGTG AACCAATGGC 660 

AGCATTATCT AGAATTGCTA ATGGTACTCA TAAGCAAATT AAAATTATAC CTGAAGATAC 720 

CGTTGTATTT AGTTCATCAC CTATCCCAGG TAATACAAAA AGTATTAACA GAACTATTAA 780 

TTCCTTGTAT AAAGCTGGTG CAGATGTTAT CCATAGCAAG ATTTCTAACA TCCATACTTC 840 

AGGGCATGGT TCTCAAGGTG ATCAACAATT AATGCTTCGA TTAATCAAGC CGAAATATTT 900 

CTTAeCTATT CATGGTGAAT ACCGTAT6TT AAAAGCACAT GGTGAGACTG GTGTTGAATG 960 

CGGCGTTGAA GAAGATAATG TCTTCATCTT TGATATTGGA GATGTCTTAG CTTTAACACA 1020 

CGATTCAGCA CGTAAAGCTG GTCGCATTCC ATCTGGTAAT GTACTTGTTG ATGGTAGTGG 1080 

TATCGGTGAT ATCGGTAATG TTGTAATAAG AGACCGTAAG CTATTATCTG AAGAAGGTTT 1140 

AGTTATCGTT GTTGTTAGTA TTGaTTTTAA TACAAATAAA TTACTTTCTG GTCCAGACAT 1200

TATTTCTCGA GGATTTGTAT ATATGAGGGA ATCAGGTCAA TTAATTTATG ATGCACAACG 1260 

CAAAATCAAA ACTGATGTTA TTAGTAAGTT AAATCAAAAT AAAGATATTC AATGGCATCA 1320 

GATTAAATCT TCTATCATTG AAACATTACA ACCTTATTTA TTTGAAAAAA CAGCTAGAAA 1380

ACCAATGATT TTACCAGTCA TTATGAAGGT AAACGAACAA AAAGAATCAA ACAATAAATA 1440 
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GCTTTTTCTT TATATATGAT GAGCTTGAGA CATAAATCAA TGTTCAATGC TCTACAAAGT 1560 

TATATTGGCA 6TAGTTGACT GAACGAAAAT GCGCTTGTAA CAAGCTTTTT TCAATTCTAG 1620 

TCAGGGGCCC CAACATAGAG AATTTCGAAA AGAAATTCTA CAGGCAATGC GAGTTGGGGT 1680

GTGGGCCCCA ACAAAGAGAA ATTGGATTCC CAATTTCTAC AGACAATGTA AGTTGGGGTG 1740 

GGACGACGAA ATAAATTTTG AGAAAATATC ATTTCTGTCC CACTCCCGAT TATCTCGTCG 1800

CAATAl-rriT TTCAAAGCGA TTTAAATCAT TATCATGTCC AATCATGATT AAAATATCAC 1860 

CTATTTCTAA ATTAATATTT GGATTTGGTG AAATGATGAA CTCTTTGCCT CX3TTTAATTG 1920 

CAATAATGTT AATTCCATAT TGTGCTCTTA TATCTAAATC AATGATAGAC TGCCCCGCCA 1980 

TCTTTTCAGT TGCTTTCAAT TCTACAATAG AATGCTCGTC TGCCAACTCA AQATAATCAA 2040 

GTACACTTGC ACTCGCAACA TTATGCGCAA TACGTCTACC CATATCACGC TCAGGGTGCA 2100 

CAACCX3TATC TGCTCCAATT TTATTTAAAA TCTTTGCATC ATAATCATTT TGTGCTTTAG 2160

CAGTTACTTT TTTTACACCT AACTCTTTTA AAATTAAAGT CGTCAACGTA CTTGATTGAA 2220 

TATTTTCACC AATTGCCACA ATGACATGAT CAAAGTTACG GATACCTAAA CTTTTCATAA 2280 

CTGCTTCATC TGTAGTGTCT GCAACAACCG CATGAGTAGC QATATCACTA TATTCATTCA 2340

CTCTATTTTC ATCATGGTCG ATGGCCATTA CATCCATGTC TAATGCATTC AACTCACGAA 2400 

CX;ATACTACC TCCAAAACGA CCTAGACCGA TGACTACATA TTCTTTACCC ATACTCGCCC 2460 

TCCATTAAAT GATTTTCATC AATTCATTGA AAATATAAAT TTAAAATTAT TATAAATGAG 2520

TACCCCAACT AAATTATCTA AATGCAGTAA TGCAAGTAAA TGAAAGTTGG GGTATCGTCT 2580 

CAACTTATGA TTTCTTTCCT TCAACATATT CTTTGTCGAA AACAAATAAT CTTAATAATA 2640

ATATTAACGA TGGAAGTAAT AAAAGTAAAC CTAAAATAAA GACAATCACT AATGTCCAGC 2700 

CCACTTCTGG ATTAACATAT GCATCTGTAA TTTTTACAAA CGGATATAAA AGGTATGGCA 2760

ATTTACTAAT TCCATAGCCA AAGAACGCGA ACATCATTTG TAAAATAACA AATACAAAAG 2820 

CCAAACCATG TTTTTTCTTA AAGAATGTTA ACAATGAAGC TAATGCAAAG AATAAGAAAC 2880 

TTATACCAAA CATCCACCAA TAGTCAAAAA CAGCTGAATA AAAATGTTCA 6AATTTTGAA 2940 

TGCGTAATGA TAGAAATACG AATAAACAAA TGATAATCAT CGGCGGCCCT AAAAATATGT 3000

GCCATTGTCT TGTTAAATTA TATGCTGGTT CGTCATTTGC TTTTTTAGCA TAATATGTCA 3060 

AAAATCCTGA TGAAATATAT AAAACTGAAA TAATTGCCAA GAATACTACA GACCAAGCAA 3120 

ATGGGCTTAA TAATAACTGC ACCCAATCTA GATCX3ATAAC ATTGTTTCGA ACATTAATAT 3180

AGCCACCrrC TGTAATAGTT AAAGCAGTAG ATAATGAAGC TGGAATTAAT AATCCACTTA 3240 
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AACTGTTTCT CAAOGATATC ATAATCAGTQ CTATTGAACC TGGTATTAAC AATACCGTGC 3360 

CTAAATATTT GATTGACTCT GGAAAGAAAC CTAC6AATCC TACGAAGAAG AAAACAAAGA 3420 

ATACATTCGT AACTTCCCAA ACTGGGTTTA AATAACGTGA AATTAAGTGA TTAATTTTCT 3480

TTTCATCACC AGTTAACTTT GAATGCAATG CGAAGAAACC TGCCCCAAAA TCTATAGAAG 3540 

CAATAATGAT ATAGCAAAAT AAAAACAACC ATAACACTGT TATACCTATA AATGCATAAA 3600 

TCATTTTTCT ATTTCTCCTC CTTGCTTCTT GGCTAAACGA TTTACATCTT CATACGCCGG 3660 

TTTATTTTTA AACATACGAA TTAATACGTA TGCACATGTA TACATTAAAA TGATGTACAA 3720 

TATGCCAAAT AAAATTGTAA CGAaGGTTAT TCCGCCTGCT TGTGTTGCTG CTTCTGCCAC 3780 

GCGCATATAA CCACGAACAA TCCAAGGCTG TCTACCCATC TCTGTTAAGA ACCATCCAAA 3840 

TTCTATAGCT AGCATTGAAG CTGGGCCTGT TAATAATATT CCATAAAGCA TCCATTTATG 3900 

AGTAGAAAAC TTTCTAAGCT TITrAAACAT TAAAGTTAAG ACATAAACAC CTGAAATGAC 3960 

AAAACATAAA ATTCCCATCG TTACCATTAA ATCAAAGAAA TAATGGACGA TCATAGGCGG 4020 

ATGTAAACTT TTTGGAAAAT CATTTAACCC TTGTACTTTA GTTTTGACAC TATTATCTGC 4080 

TAAGAAACTC AATAGTCCAG GTAATTCAAT CGCACCTTTA ACTTGCTGAG TCTTTTCATC 4140

TAACACACCA AATAATAATA ATTTGGCATG GGAAGATGTA TCX3AAATGCC ATTCATAAGC 4200 

TGCTAATTTT TCAGGTTGGA ATTTATGCAA AAATTTTGCA GATAAATCCC CTGCCAACAT 4260 

AGAAAGTAAT GTTGAAAAGA ATCCAACTAT CATAGACATT TTCAAAGCTT TCTTATGGTA 4320

GACAGTATCT TTAGGTTGAC GATTACGCAA TAATTTAAAA GCTGCTATTG ATGCAATAAC 4380

AAATGCCATC GTCATACCGG CTGTAGTAAT TACGTGAAAT GATCGAACTA TAAACGATGG 4440 

GTTAAACATC GCTTCTATAG GTTGAACATT GACCATCTTT CCATTCTTCA ACTCAAAACC 4500

TGCASgCGTA TTCATAAATG AATTCACTGA AGTAATGAAG AATGCTGAGA AAGAGCCACC 4560 

AATAATTACT GGTATACTAA TTAAGAAATG TGTCCATTTA TTTTTAAAAC GATCCCAAGT 4620 

ATATAAATAT ATACTTAAGA AAATAGCTTC AAAGAAGAAC GCAAATGTTT CCATAAATAA 4680 

TGGAAGTGCA ATAACGTGTC CACCCATTTC CATAAATGTA GGCCAAATCA ATGATAATTG 4740 

AAGTCCTATA ATTGTACCTG TAACAACTCC CACTGCTACA GTAATTGTAT AAGCTTTAGC 4800

CCATCTTTTG GCCATAGCTA TATATTGAAG ATCATTTTTG CGAATACCTA AAAATTCTGC 4860 

AATTGCGAAC ATTAAAGGCA TACCAACACC AATCGTTGCA AAAATGATAT GAACTGCTAA 4920 

AGTCATAGCT GTCAAAAACC GACTGATTTC AACTGTATCC ATTTAAAAAC ATCACCTTTT 4980

TCTTTTTTTG ATGACAACAC AATGAACTTA ATTATAATTG CTATAATGTG TATTTTTAAA 5040 
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GAATTTCAAT GTATAATTGT GTATATTACA TTAGAATAAA GCACGAAGGA GCATGATACA 5160 

TGTCAGAAAT AATCGTTTAT AC6CAGAATG ATTGTCCACC TTGTACATTT GTAAAAAATT 5220 

ATCTAAATGA GCATCACATT GATTTTGAAG AGAGAAATAT CAACAATCAA CAATATCGAA 5280

ACGAAATGAT AGATTTTGAT GCTTTTTCAA CTCCGTTTAT TTTGTTGAAT GGC 5333 
(2) INFORMATION FOR SEQ ID NO: 171: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11126 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171: 
ATACGTGACC CTTTATCCGA AAATTTCTTT TCATATTCTG TTAAAATATT ACTGCCATCG 60 
TCTTCTTGAT GTAAATTTAG ATTTATTTTT GTAAAATACA TTCCAAATTG AGACATACTT 120 
TCTAAACTGT AGGCAAATAG TCCTCTGTTA TCAGTTTTAA AATGTAAATC TCCTTCATCA 180 

TTTAAGATTT GTTGATACAA CGCTAAAAAC GTATGATACG TTAAACGTCG TTTTGCATGA 240
CGATTTTTTG GCCATGGATC TGAAAAGTTC AAATAAATAC GCGAAACTTC GCCGTCTTTA 300 
AAATATTCAT TTAATTCAAT GGCGTCATTA CAAATAATCT TTAAATTTGT TAAACCCATC 360 

TCTTTAACTT TATCCAATAC TTTATAAACG ATACTTTTCT CACGTTCCAT TGAAATATAG 420
TTAATATGAG GATTTTGAGC AGCTAATGTT GTAATAAACT GCCCCATACC CGAACCAATT 480 

TCAATGTGTA TCGGTTGCGT TTTaTCAAAC CATTCAGTCA TTTTCCCTGc ATGTTGACCG 540 

TCCATGTCAA CCAATTCAGG ATGATCTTTT AAATAATCTT CAGCCCATGG TTTGTATCGA 600 

ACTCTCATAT TTTATTGTCC TCTTAAATAA ACATGTTACT ATTCATAACT TCATTTAGGA 660 

ATTTAAGCCA AGTGTTCATA TCCTTATATC TTTTTTGCTC TTCATACCAT TGAACAAGAC 720 

CTATAGATTG AATTACCGTA TACCATTTCA TACGTTTATT TAAATTCAAG CTCTCTTGAA 780 

CACCATATGT TTCAAGCCAT TCAGACCATT GTTGTTGTGO AACATA6TTG TAAAGCAGCA 840 

TTCCGATATC AATTGCCGGG TCTGCAATCA TTGCACCTTC CCAATCAACT AAAAATAGTT 900 

CATCTCGATC GGATAATAAC CAATTATTAT GATTCACATC ACCATGTACA ACAGTGAAAA 960 

AACGCGAATC TAAACTCGGT ATATGCTCTT CTAAATAGGT TAATGATTTT CTCACAATAT 1020 

GATGTGTTAA AACTTCTCTT GATAAAGAGG CATTAATTTT ATTAAGCATA ATCTCAGGAG 1080

TAATAGGTTC CATTTCCATA CGCTTTAACA TACTTAATAA AGGTCTAGAA TTGTGTATCT 1140 
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TTTTCCAATG TTGTGCTGTA ACAACCTCGC CTGTTTCTAT GCGTTTCGTC CATACTAATT 1260 

TGGGCACAAT ACCTTCTGCT GATAATGCCG CAATAAATGG ATTTGAATTT CGTTTTAAAA 1320 

ACAACTTTTG TCCATCTTGT TCAGCCATAT ATGCTTCACC AGATGCACCA CCTGCTGAAT 1380 

CAAGTGTCCA CCCTAATTGA TAAAACTGCT CCAACTCGTC CACCTCACTT TCAATTAGAA 1440

AATGGCTCTA GAAATAGGTT TTTCAAGAGC CATATATTCT AATTTATAAC ACCATACTGG 1500 

TACAAATATT ATGTCCAGAT AATTATTGTA AATCCTCAAC CAATGCCTAC ATTACACGAC 1560 

TAAATTTAAA TCGTAATGTC TGTCATTGAC ACCATACATT CTATAGTCAC TTACTTGACA 1620 

TATAATGTTA CCGTGTCTAA AACTACATGT TTTTGAATCT CTGTAGGCGA TAAACTcTAG 1680 

TTTTCAAAAT AATTGCTATC CXATTTTCAT GGTTAGCATA AATTTATGAA CTGTAACATT 1740 

TACGTACTTA ^GTAAAATATG AT6CACATCA TATTTGTrAC TCATAGAAAA TTTTATAAtT 1800 

TTTATCATTA TATTTCAACT GAAAATGAGA AACAAAATGG CACTTTTTAC TAATATGTGT 1860 

TTTCTAAACA ACACTTTTAA GCTTCGTTTT AAATTATAAC ATAATTCACT TACGAAAGTT 1920 

GATAAATTTA AGTAATTTAA TCTAAAAATA TGATGAAAGA ATTTTAAATA CTGTGTGACT 1980 

CTATATACTT TTCAAATCCT TCTTGTAGTT GACGTGTAAT TGG6CCAACT TTACCATCAT 2040

TAACTGGTTC ACCATCTAAT TTAATAACAG GTGTAACCTC AGCTGAAGTA CTTGAAACAA 2100 

TAACTTCATC TGCGTTTTTC AAGAAATCTA CAGTAAACGT TTCTTCTTTA AATGGGATGT 2160 

TATAGTCTTC GGCAATTTTT TTAATTACAA TTCGTGTAAT ACCATTAAGA ATATAGTTGT 2220 

TAATCGGATG TGTATAAATC ACACCGTCTT TAATTGCATA AGCATTACTT GAAGATCCTT 2280 

CAGTTACAGT TtCACCTCGA TGTTGAATTG CTTCAACTGC ATTATATTTC ACAGCATATT 2340 

CTTTTGCTAA TACATTcTCC TAATAAGTTC AAGCTTTTAA TGTCGCAACG TAACCATCGG 2400 

ATATCTTCAA CGGTAACACC ATTCACACCA TTTTCTAAAT GATCATAAGG ACGATCAT7A 2460 

CTCTTTGTAT AAGCAACAAT TGCTGGTTCT ACTTCAGGTG TCGGGAAGCT ATGATTCCTT 2520 

TCAGCTACAC CACGCGTTGC TyGAATATAA ATTGCCCCAG TTTCAATTTG ATTCATATCA 2580 

ACTAATTTAC GAGATAGTTC AATTAATTCT TCTACAGAAT AATTTAAATC TAAACCAATC 2640 

TCATTGGCAC TACGTwAAAw TCTTTCATAA TGTTCTGTTA CTGTAAATAA CTTACCATTA 2700

TATACTCGAA TGTATTCATA AATACCATCG CCAAATACGT ATCCTCTGTC GTTGTATGAA 2760 

ACCTTTGCTT CACTTGGACT TACAAACTCA CCATTTAAAA AAATTTTTTC CATATATTAT 2820 

TCCTCCACGC ATAATGAATA AATTGCTTCT AAGTAAATAC TAGTTGCGTT AAATAACTGT 2880

TTTTTAGTGA TATATTCATT TTTCTGATGC ATTAAATCTT CAGAATCACT AAACATTGCG 2940 
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TCAGTCATAT CATTTGTTTG ATTTCTATAT GCAGTAACTA ACTTTTGTAC AAAAGGATCA 3060 

TTTTTATCAA CATAATGTGG TGGTTGGACT TTACCTAATT TCACTTCAAA GCCATATTGT 3120 

TGAATCTCAT TTGCAAAACG ATCCATAGCT TTTTCAAATT CAAATCCTTC TGGGTAGC6T 3180 

AAGTTGATAC CGAAAAGACC TGCGTTTTCA TTATCATATG TAATAACACC AATGTTAGTT 3240

GTCACGTCAC CCATGACATC TGTATGGAAT TTCATTCCCA TCTTTTCACC AAAATCTGAA 3300 

TTAAATAAGT AGCGATTACT AAATGCTACA AACGCTTGTG CATTATTATC AAGATTTAAT 3360 

GATGCTAAGA ATTTTAGTAA GTAAAGACCC GCATTCACAC CGATAGATGG ATCCATACCA 3420 

TGAACCGCTT TACCTTCAAC TGTTAAAACT AGAATGCCAC TATCAACAGT ACTATCACCT 3480

TGTAAATGAT TTTGTTCTAA AAAGTACTCA AAGTCTTGAA TAACATCTGT CATATTTTCT 3540 

TTAACAAGCA CTCTTGCTTC TGCATGATCA GGTACCATGT T6TAACGTTC ACCAGATTTA 3600 

AAAGTTATTA ATTCATAATC AGGTTCATCT TGATCTTCAG TAAGTTTATT TTGAACTAAA 3660 

TCAAATGTTG TAATGCCTTT TTCACCATGA ATACATGGAA ATTCTGCATC TGGTGCAAAA 3720 

CCTAATGTTG GCATTTCTTC TGTTTTAAAA TAGCGATCCG TACATTTCCA ATCAGATTCT 3780 

TCATCC6TAC CAATAATCAT ATGAATACGT TTCTTCCAAT CCACATTCAT ATCTTCTAAT 3840

ATCTTAATTG CATAATAAGC AGCAATTGTT GGACCTTTGT CATCAAGTGT ACCTCTAGCT 3900 

ATGATAGCAT CTTCTGTTAC AACCGGCTCG AACGGATTAC TATCCCATCC ATCACCAGCA 3960 

GGAACAACGT CAACATGACA TAAGATACCT AATACGTCAT TTCCTTTACC TGCCTCAATT 4020 

CTTCCTGCAA TATGATCCAC ATCATGTGTT GTAAATCCAT CTCTATGTGC AATTTCATAC 4080 

ATGTAGTCTA ATGCCTTACG AGGACCTGGA CCAACTGGTG CGTCTTCTGA TGCTTTTGCA 4140 

TCATCTCTCA CACTTTCAAT TGCTAATAAT CCTTTTAAGT CATTAATGAT TTGATCTTCG 4200 

TATTGTTGAA CTTTTTCTTT CCACATTCGA AATCGACTTC CTTTTTTCTA TAAGTTAAAT 4260 

TCTATTTTAC ATGAAAAGAT ATAAAAACTA CAATAAGATG TCAGAAAATA ATAAAAAGGA 4320 

ACAAAACGAT GCTATTGATA TGACACAAAT CATAAATAGC TGCTTTGTTC CTTTTTTAAT 4380 

TTATATATTT AAAATACACA TATTCAAGAG CTCGAGATAT AAGTCAATGT ACTAGGCACA 4440 

CAATTTAATA TTGACAGTAA TTAACCGAAC GAAAATGCGC CCCGGGGCCC CAACATAGAG 4500

AATTTCGAAA AGAAATTCTA CAGACAATGC AAGTTGGCGG GGCCCCAACA TAGAAGCTGG 4560 

CCAATAGTTA GCTTTCAATA ATGTGCAAGT TGGGGTAAGG GCCCCAACAC AGAAGCTGGC 4620 

CAATAGTCAG CTTTCAATAA TGTGCAAGTT GGGGTAAGGG CCCCAACACA GAGAATTTCG 4680

AAAAGAAATT CTACAGACAA TGCAAGTTGG CGGGGCCCCA ACACAGAAGC TGGCCAATAG 4740 
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TAAAGAAATA OGTTTTCTTT AGATATTAGT ATTTCTTATG AATGAGTTTC ACGCATGTAT 4860 

TCTTCTTTCT ATATGCATAT TAGCTATGAC TAACGATAAA GAACCTGAAA CACTAATAAA 4920 

TGTCCTATAG TTTACAATAT TATATTGGCA GTAGTTGACT GAATGAAAAT ACGCTTGTAA 4980

CAAGCTTTTT TCAATTCTAG TCAACCTTGC CGGGGTGGGA CGACGAAATA AATTTTGCTA 5040 

AAATATGATT TCTGTCCCAC TCCCTTATCA TTTCTGTCCT ACTCACATCT TATTCTTTAT 5100 

CAGATAATGC ATTTTTATTC TTTTTTAAAT CTTCTTCAGT GACGATACGT AAATTATTAT 5160 

TTGGTGTGCG CCACCTTCAT CATCAAATTT ACCTTTTTCA ATACTTTCGT CAGTCTTATT 5220 

GTCATATTCG GTAAATTTTG ATTTTTCTTC TTTGAAAAAT GCTTTTGGAT TATTTTTTAA 5280 

TCTATTAGCA TATTCTTTCG GATTTGTTTT TACTTCTTTA ATTGTTTCAT TAGCAATTOT 5340 

TCCTAATTGC GTCGCTTTAT CCTTAGCATT ATCTTTATAG CTTTGAGGAT CTTGTTTATA 5400 

TTTATTATAT TCcTGcTTTC AGCTTGTCAC GACTATCTTT ACGTGTAACA AGTACAGCTG 5460 

CTACAGCGCC ACCTATACCT AAAATCGCTT TAAATAAATT ACCTTTTGCC ATATCAATCG 5520 

TCTCCCTTTT ATTTATAATT TAATTTGTCA AAATCATTTT CAGTTAATAA ACGATATTCT 5580 

CCTGAATCTA AATTGCTGTC CAATTCTAAA TCAGCAATTT TGATACGTCT TAAATGTAAT 5640

ACCTCATTTT GAATGCTATG AAACATTCGT TTAACTTGAT 6ATATTTTCC TTCATAAATT 5700 

GTTACX3TGTG ACGTTTGATT ATCAATATAA GTTAATATTG CAGGCTTAAC CTTGCCATCA 5760 

GTCAGTGTtA CACCCTCTTT AAAAGCTTGA ATGTCGTCTT CAGTGATAGG ATTTGCTGAA 5820

ATAACTTCAT ATTTTTTAGA AACATGTTTG TTTGGACTCA TTAATTCATG ATTAAAATCA 5880

CCATCATTCG TTATCAATAA AAGCCCTTCT GTATCTTTAT CAAGACGACC AACCGGAAAA 5940

ATATTTAGAT GTTGGTATTC AGGTATTAAA TCAATAACGG TTTTTGAATG ATGATCTTCA 6000 

GTTGCTQATA TATAACCTTT TGGCTTATTT AACATAATAT AGACATTTTC AATGTATTCT 6060 

ATTAATTCTC CACGAACTGT TATCTTATCG TTTTCTGGTT CTATAT6TGT TTTTGGTGAT 6120 

TTAATTACTT GTTCGTTGAC ATTTACAAGG CCTTTTTTAA GTAACTGTTT GACCTCATTA 6180 

CX3TGTACCGA CGCCCATATT TGCTAAAAAT TTATCTATTC TCATCGTAAA AACCTAACTC 6240

TAOGTCTTAA TTTTTCAGGA ATTTCACCTA AGAATTCGTC CGCAAGACGC GTTTTAATTG 6300 

TGATTGTACC GTAAATTAGA ATACCTACTG TAACACCTAA AATAATAATG ATTAAGTAAC 6360 

CAAGTTTAGT AGGTTCTAAG AATAGATTTG CAAGGAAAAA TACTAATTCT ACACCTAGCA 6420 

TCATAATAAA TGAATACAAG AATATTTTTG CAAAATGAAT CCAACTATAG CTGAATTTAA 6480

ACTTCGCATA TTTTTTAAGA ATATAGAAAT TACATCCAAT TGCAAATAAT AATGCGATAC 6540 
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ACTTQATAAC TACAGAAGCT AAAATAACAT AAACTGTTAA TTTCTGTTTA TCTATACCTT 6660

GTAACATTGA TGCCX3TTACA CTTAATAGTG AAATTAGTAT TGCTACAGGC GCATAATAGA 6720

ATAATAAGCG ACTACCATCA TGGTTAGGGT CATGACCTAA AACAATTGGA TCGTAACCAT 6780

AGAAAACTGT GAATAATGGT TGTGCCAAGG CCATAATTCC AATACTAGCT GGAACAGTTA 6840

TAAACATTAA TACACCAATA GATGTTCTAA TTTGATGATG CATTTCATGT AAGCGACCTT 6900

CTGCAAATGT TTTTGTAATA TAAGGAATTA AACTCACTGC AAAACCAGCA CTTAATGATG 6960

TCGGAATCAT TACAATTTTA TTAGTTGACA TATTTAGCAT ATTAAAGAAT ATATCTTGTA 7020

ACTGTGAAGG TATACCAACT AAAGATAAAG CACCGTTATG TGTAAATTGA TCTACTAAGT 7080

TAAATAATGG ATAATTCAAA CTTACAATAA CGAACOGTAT ACTATAAGCA ATAATTTCTT 7140

TATACATCTT GCCATATGAC ACATCTATAT CTGTGTAATC AGATTCGACC ATACX5ATCAA 7200

TATTATGCTT ACGCTTTCTC CAGTAATACC AGAGTGTGaA TATACCAATA ATCGCACCAA 7260

CTGCTGCTGC AAAAGTAGCA ATACCATTGG CTAATAAAAT AGAGCCATCA AAGACATTTA 7320

GTACTAAATA ACTTCCGATT AATATGAAAA TCACGCGTGC AATTTGCTCA GTTACTTCTG 7380

ACACTGCTGT TGGCCCCATA GATTTATAAC CTTGGAATAT CCCTCTCCAT GTCGCTAATA 7440

CAGGAATAAA GATAACAACC ATACTAATGA TTCTTATAAT CCAAGTAATA TCATCGACTG 7500

ACCAACCGTT TTTATCATGA ATGTTTCTAG CTAATGTTAA TTCAGAAATA TAAGGTGCTA 7560

AGAAATACAG TACCAAGAAA CCTAAAACAC CGGTAATACT CATTACAATA AAACTCGATT 7620

TATAAAATTT CTGACTTACT TTATATGCCC CAATAGCATT ATATTTCGCA ACATATTTCG 7680

AAGCTGCTAA TGGTACACCT GCTGTCGCAA CTGCAATTGC AATATTATAT GGTGCATAAG 7740

CGTATGTGAA CGGCGCCATA TTTTCTTGTC CACCAATTAA ATAGTTGAAT GGAATGATaA 7800

AAAGTACGCC CAATACCTTG GTAATTAATA TACTAATGGT AATTAAAAAG GTTCCACGCA 7860

CCATTTCTTT ACTTTCACTC ATTACGAATC TCCCTATCTC ATGTTTATTA AAGTTTTGTA 7920

AACTAAAAGC TGTTTCTCTG TAAAATCATT TTTCATTATT ATGAATATAT CACAAAACTT 7980

TATTTCATTG TCGTATATTC AATGAATTAT CATAACAAAA TTATCAACAC ATTGTCATTG 8040

AATACTAGAT TTTGATTAGA ATATTACGAA ATTTCATATA AACATTATAC TACTATTTGA 8100

GATGAACATC GCATAACAGT AGAAAAATCA TTCTTATCAT ACACATACAT CTTCATTTTT 8160

TATGAAGTTC ACATTATAAA TATATTCAAC ATAATTGTCA TCTCATAACA CAAGAGATAT 8220

AGCAAAGTTT AAAAAAGTAC TATAAAATAG CAATTGAATG TCCAGTAACA AATTTGGAGG 8280

AAGCGTATAT GTATCAAACA ATTATTATCG GAGGCGGACC TAGCGGCTTA ATGGCX3GCAG 8340
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GTAAACTCAA AATATCTGGT GGCX3GTAGAT GTAACGTAAC TAATCGATTA CCATATGCTG 8460 

AAATTATTAA GAACATTCCT GGaAATGGGA AATTTTTATA TAGTCCCTTT TCAATTTTTG B520 

ATAATGAATC CATCATAGAT TTTTTTGAGT CTAGGGGTGT TAAATTAAAA GAAGAAGATC .8580 

ACGGGCGTAT GTTTCCAGTT TCCAACAAAG CACAAGACGT GGTTGATACA TTAGTGACAA 8640 

CTATCGAACG CCAACATGTA ACGATTAAAG AAGAAGAAGC TGTTAGTAGA ATCGAAGTTA 8700 

ATACAGACCA AACTTTCACT GTACATACTC AAAATAATAG TTATGAAAGC CATTCGCTAG 8760 

TGATTGCTAC AGGTGGTACA AGTGTCCCTC AAACTGGTTC AACTGGTGAT GGTTATAAGT 8820 

TCGCACAAGA TTTAGGTCAT ACCATTACTG AGTTATTCCC GACCX3AAGTT CCAATTACAT 8880 

CAGCTGAACC TTTCATCAAA TCCAATCGTC TAAAAGGTTT AAGTTTAAAA GATGTTGAAT 8940 

TGTCAGTACT TAAGAAAAAT GGTAAAAAAC GCATCAGTCA TCAAATGGAT ATGTTATTTA 9000 

CTCATTTTGG TATCAGTGGT CCAGCTGCAT TAAGATGTAG TCAGTTTGTT TATAAAGAAC 9060 

AAAAAAATCA AAAGACACAG CACATTTCTA TGGCAATCGA TGCATTTCCT GAATTAAACC 9120 

ATGAACAATT AAAACAACAC ATCACATCAT TATTATCGGA CACACCAGAT AAAATCATTA 9180 

AAAACAGTTT GCATGGTCTA ATTGAAGAGC GCTACTTACT GTTCATGCTG GAACAAGCAG 9240 

GAATCGATGA AAATACCACA TCACATCACT TATCAAATCA ACAATTGAAC GACTTAGTAA 9300 

ATATGTTTAA AGGGTTTGTA TTTAAGGTGA ACGGGACATT ACCTATAGAT AAGGCATTTG 9360 

TCACAGGTGG TGGTGTGTCA CTTAAAGAAA TTCAACCTAA AACAATGATG TCTAAATTAG 9420 

TTCCGGGATT ATTTTTATGT GGTGAAGTAT TAGATATACA TGGTTATACT GGTGGTTATA 9480

ATATTACAAG TGCACTCGTA ACAGGACATG TCGCTGGATT ATATGCCGGA CATTACTCAC 9540 

ATGCATCAAT GGAATAATAG TATAAAATTT GGTTCGATTC TCTTTAGTAG ATCAACTTTT 9600 

TCATTCAAAT AAAAATGACC TTAATATAAC TGAGTCACTA AAAAGTGTCG TTATATTAAG 9660 

6TCATTTCGT TAATTATGAT TCTTTTTCGT TTTTAGTACG TCTTCTAGCT AACAAAGCCG 9720 

CACCTGTAAT CAGTGCAAAT TCTTTCAATG GTAAATCCAT TCCTTCAGAA CCTGTATTTG 9780 

GAAGTTCTTT TTCAACrTTTG CGCGATTCAT GTGTCTCTTC TTTTTTAATA GGCGTACAAA 9840 

CTTTTGGAGC TGGCTGAATT TCTTTTGGTG ATACTTTCGT OGCTTCAGCT GGTAATTTAA 9900 

TTGCTAAAAT TTCATCAACA ATGAATTGCG TGTGTTGTTT GATGTCATTT AATGTCGCAT 9960 

CTTCATCAAT CATTCTATTG CCATCTGCAA CATATTGATC AATTAATACT TTTACTTTAG 10020 

CTAATTGTTC TGGTGTTGCG ATCGCTTTGA ATTTOGCATA TGTTTGTTGA GCAATGTTAT 10080 

CAATTCGCAG TAAGCTATTT TCTTTTTCAG TAATTACTGC TTCTATATCG CTTAATGCAA 10140

CATCCATTTG TAATTTTAAA GCAGTTATAG CTTTTAATGC ATCAGCCTTA TTACGATTAC 10260

TTACTTTTCG ATAATTTTGC ACTAAAGCAG TGACGCGTGC AAGATCATCA TTAATCGTTT 10320 

TTTCAGCATC TGGCTT m 'A ATAGGATGTA CATCTAAATC AT6TATTGTT TGTAGATTTA 10380 

ATGATGCTGT TTTATCAACT TGTGCATTGC TACGATCTTG ATCAATTTGT CCAATAGCAG 10440 

TGTCATAAAT ATTTTGTAAC TGTGCTAATA TACTATTTCT TTCTTCTACC GTTGCTTGAA 10500 

TATTCGCTTC AATTGCTTGT TTTTTATCGT TGAATAATGT TGTCAATTGT TCTCGAGCAG 10560 

ACGCCTTTCT GTTAATAACA GGTTCGATTT CACGAATTTC GTTTTTCTCA TCATGCAATA 10620 

AATATGCCAC ATCTGCATTA GTCACTGCAC TAGCAATTTG TTGTTTAGCT TTAATTAACT 10680 

CTTTTTCAAC TTGTGCTATT GCAATATTTT GTTCTTCATC TGTC6CTTCG TTATTTGCTT 10740 

TAATTAAATT AATTTTATTT GTAGCGATAT TTTGAATTTG TT6TAATGCT GTTGCTTTAA 10800 

CTGTTGTCGC TGGTTTAATT TTTGAAATAA TATTTTGAGC ATTTATACTA TCTTGATTAA 10860 

CTTGGGCAGT CTTATCTGCA TGATTGATCT GATCAATAGC CTGATTAAGT GCTTGTTCTA 10920 

CTAAATGTTT AGCAGCTAGT CTTTCTTCTT CA6TTGATAA ATCGCTTTGA TCGATTAGTG 10980 

CATTTTGAGC TTCGGCTTTT ACACCAACAG ATTGACX3CX5C TGCTGGTTTA ACTTGAACTT 11040 

TAGGTAAAAT CACTTTGATG TTGTCGTTGC CATCAGTCnC AGTnCGATCC ACTTCTGCAT 11100 

TCGTTTTGTT TTGTGCAATG TCATTT 11126 
(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3660 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:

TTGCCCCGCA CGGCGGTGTG nTTCCTAGAA ATAATGAATA TAAAGaGAAA TATATAACAA 60 

CGATTTTGAA TTATGAACCT GGTGATATCG TTACAATCAA ACGTGTGAGA GATAAGACCG 120 

ATTT6CTAAT ATATTTGTCT AGTAAAGATA TTTCTATTGG TAATGAAGTG GAAATTGTAT 180 

GGAAAGATGA AATGAATAAA GTAATTATCA TTAAACGTAA TGATAATGTA ATTATTGTCA 240 

GTTACGAAAA TGCAATGAAC ATGTTTGCTG AAAAATAAAA TAAAGAAGCC ATAAAGATAT 300 

CCAT6ATTGA ACTGATAAAG ACATATGGAT AATTGCTTTA GGCTTCTTTT TTATTAGTTA 360 

ATTTATCAAG TGAGTATATT TGAGTAAAAT ATTCACTGCA TAAAGATTGA AGATAATCCA 420

CTGTGGACTC GGACGCTGGA AAGTCAATTT AGCAATCGTC CAACTAGATT GTAGAACTTC 540

GCCTAATAAT ACACCTAAAA TATATTGATA ACTCATTGTG ACAAGTAGTT GAATTTCTAC 600 

TATATTTTCA TCTTTTAATA TAAAATACAA CATGATAGAA ATTAAAGTTA TAACAACAAT 660 

GGGTGAGCCT TTTCtAGATG TTAAAATTAA AAAATAAATA AATATCAATA AATAGGTAAA 720 

TATAAAGAAA CTAGGTATCT GATAATGGCT CGACGCTAAA CCTATCAATA ACATAATAGG 780 

TGGCATAAAA TAACCACCAA TCGTTGTAAG CCATTGGCCT GCTAGATGTC TAGATTGTGT 840 

AATTGCGAAT CCTTGTTGTA ATGTCTGTTG TCGCTCTCGT GGACTTGTTA CAATGACTAA 900 

ATCTTTTGCA CGGCCACCAG CGAGTTTATT AAACAGTACA TGACCAAATT CATGTGTTAA 960 

AACAGGGATA TAGTTTAAAA TGACATCTAA ATAGTTCAAA ACAGGCTTAT GTCTATATTG 1020 

ATGAATAGCA ATATAACAAG CTGCAACAAT AACX3ATAATG TATATATTAA GTTGAATTGT 1080 

CGTATTAAAA AAGTTTGATA AATAATTCAT TGTTAACCTC ATATAAGATA TTAATTTAAA 1140 

GTTTGCTTAT CACTTATTAT AAATGATATT GGCATCAATA GCGTTAGACT TTAGACTTAC 1200 

CTTAGTTAAA CTAATTTTAA TTTTTGAAAA GGTGAATATG TGTTAAAATA AAGCAAAATC 1260

ATTTCGATAT AAATAGGATG AATATAAATA CTGTTAATAT TGATTACACT AACATAATAA 1320

TGAAATAAGA TAGGAGATTC CTGTTATGAC TGTTGAAGAA AGATCCAATA CAGCCAAAGT 1380 

TGACATTTTA GGGGTCGATT TTGATAATAC AACAATGTTG CAAATGGTTG AAAATATTAA 1440 

AACCTTTTTT GCAAATCAAT CAACGAATAA TCTTTTTATA GTAACAGCCA ACCCTGAAAT 1500 

AGTGAATTAC GCGACGACAC ATCAAGCGTA TTTAGAGTTA ATAAATCAAG CGAGCTATAT 1560 

TGTTGCTGAT GGGACAGGAG TAGTCAAAGC TTCGCATCGT TTAAAGCAAC CTCTAGCGCA 1620 

TCGTATACCT GGTATTGAGT TGATGGATGA ATGTTTGAAA ATTGCTCATG TAAATCATCA 1680 

AAAABTATTT TT6CTAGGGG CAACTAATGA AGTTGTAGAA GCGGCACAAT ATGCATTGCA 1740 

ACAAAGATAT CCAAACATAT CGTTTGCACA TCATCACGGT TATATTGATT TAGAAGATGA 1800 

GACAGTAGTG AAcGnAnTTA AACTGTTTAA ACCTGATTAC ATATTTGTAG GTATGGGATT 1860 

CCCTAAACAA GAAGAATGGA TTATGACACA TGAAAACCAA TTTGAATCTA CAGTGATGAT 1920 

GGGCGTAGGT GGTTCTCTTG AAGTATTTGC TGGGGCTAAA AAGAGAGCGC CTTATATCTT 1980 

TAGAAAATTA AACATTGAAT GGATATATAG AGCATTAATA GATTGGAAAC GTATTGGTAG 2040 

ATTAAAGAGT ATTCCAATAT TTATGTATAA AATAGCCAAA GCaAAAAGAA AAATAAAAAA 2100 

GGCGAAATAA TCATGATGAC AAAAATAAAA CCGAGGAAAT CCTTAAATGG AGATTCTCGG 2160

TTTTTTCGGT TTATTTAATA ACGAAGCGGG ACTCATCGAG TTTGTTTCTA AATTCTTTTT 2220

CATCAAGTTC ACCX3TAATCT TTTAACTTTC CGCCTTCAAT CCAAGCAATC TTAGTACAAA 2340

ATTGTCTCAC TTGTCCTAAG TTATGACTAA CGAAAAAGAT GGTTTTGTTT TGCTCTTTAA 2400 

ACTCGTAAAT TTTATCTAAA OVTTTTTGTG CAAAAGTTTG GTCACCTACA GATAAAGCTT 2460 

CGTCAATGAC TAAGATATCT GGATTAACTG TGATATTAAT TGAAAAACCA AGTTTTGCAC 2520 

GCATACCACT TGAATACTTT TTAACTGGTT GATAAATAAA CTCACCAAGT TCACTAAATT 2580 

CAATAATCTT AGGTGTCATC GCTTTAATTT CTTTTCGCTT AAAGCCCATA CATAACATTT 2640

TAAATTCGAT ATTTTCAATC CCTGTAAGTT GTCCACTCAA GCCAGCACTA ATTGCGATAA 2700 

CGCTGACTTC ACCATTACX3A TCCACTTTGC CAACAGTAGG CGACAAAGAA C06CCAATGA 2760 

TATTGCTCAA CGTTGATTTG CCGGAACCAT TGATGCCAAC AAGCCCTATG ACGTCACCTT 2820 

CATATGCTTT TAAACTAATG TCATCTAAAG CGAAAAATGT TTTGTTTTTA TGTTTGGGAA 2880 

TGAGCGCATC TTTCATACGT TCTTTATTTG TACGATAAAT ACGATATTCT TTTGTTACAT 2940 

TTTTAATGTT TACCGAAACG TTCATTTGTA GACCTTCCTT ATTCACATTT ATCTAGATTA 3000 

TAATATACTA CTCAACAGTT GTTAAATTTT AAAACCTGTT GTAAAGTGTA TAGAAGATTT 3060 

TGTTATTATC AGAGTGGGTG TTTTGACACA AAATGTTAAT CATCAATGAT AACAATGATA 3120

TTTAAAAACT AAACTTATTT CAACTTACAT GATTGTATAC TATAATGTAT TTGTAATAAA 3180 

CTAATATTTT AAAGAACTAG ACAATAATTT TGATAGCATC CATGTATAGT GATAGTATTT 3240 

ACAACAATTA TTATAATACT ATTTAGTTAA GTAGAGAAAT AGTTAAACAT TTGAAAGTGT 3300 

GGTTTAATGG AATGTCAGCA ATAGGAACAG TTTTTAAAGA ACATGTAAAG AACTTTTATT 3360 

TAATTCAAAG ACTGGCTCAG TTTCAAGTTA AAATTATCAA TCATAGTAAC TATTTAGGTG 3420 

TGGCTTGGGA ATTAATTAAC CCTGTTATGC AAATTATGGT TTACTGGATG GTTTTTGGAT 3480 

TAGCftATAAG AAGTAATGCA CCAATTCATG GTGTACCTTT TGTTTATTGG TTATTGGTTG 3540 

GTATCAGTAT GTGGTTCTTC ATCAACCAAG GTATTTTAGA AGGTACTAAA GCAATTACAC 3600 

AAAAGTTTAA TCAAGTATCG AAAATGAAcT TCCCGTTATC GATAtACCGA CATATATTGT 3660 
(2) INFORMATION FOR SEQ ID NO: 173: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13868 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:

ATTAATCACT TGTTGTGTAG AGTCTTGTCC GTTTTGGTTA TGATTGTTAG CCATCATATA 120

CCTCCCTTAC AACACTCGTG GACCAGAAGT TTTCTGATCT CTCACATTAA CTTCTAACTT 180 

ACGTACTGGC ATTTCTGTGA AATATTCTAC ATTCTTTTTA ATATCCGAAC GAATTGCTTC 240

AGTTAAAGAT TGAACTTGAA CATTATTTGG TACGAAAAAG TCAGTTTTAA TGTCGATATA 300 

AGATTTATTT TTTTTGTTAT ATAGTTTCGC AACTACATTT GGTTGTCTTA CTTGATCATA 360 

TTTTGCAACC GTATCGAATG CCGTCTTTTC AACAGCTTTA CGAGATACGT AAACATGACC 420 

ATCATCGAAG TCTTTGTATA ATCCAGGTTT TCGATGCGTA GGTTTGAAGA TACTAAATAC 480 

TAATATAAGA CCTATTAATA TCAATA6TGC AGCAAGTGAA ATAAGTAAT6 GTTOGAACCA 540 

TTCAAATTQA AGGAAGTAGT CTTGATATTC AGTTATACGT CCATCTTGGA TATACATGAA 600 

TAACAGGAAC CCCACGaTTA CTACTATTAA TAA6CCAAGG ATAAAGTTTT TAAGTOGTTT 660 

CACCCCTAAC GACACCTCCT TAGTTAAAGT TAATTTAAAA ACATATTAAA TATGTACCCA 720

TCAGTTTTTT TCTTAAACAT AATAAATTAA TAACTTTAAA TTTATTTTTA ATATATAAGA 780 

TGAAGTACCA TTTAGTAATA TATTCCCTAG TTTTTGTAAA TAAAACCTCA TTATTAATTA 840 

ATTyTCGTCA ATATGTTTTG AAGAACGATA TTCTAAAATA TCTGGGTCAC GATGTTTAAT 900

TAAAACCTTA TTACTATTTC TCGGTTTCTC CTCACTCAAA GATTTTATAA GCGACCATAT 960 

CATCGCTATA ATGACCACGG AAAATGGTAA CGCAGCAATG ATTAATAAAT TTTGAATTGC 1020 

TTGAGTACCA CCTGTGTAAA TCATGATGAT TGCAAATAAT GCCATAATGA TACCCCAACT 1080

CACTTTGACA AATGACTTCG GATTAATATC ACCACTTGAA CTCAACATAC CTAAAACATA 1140 

AGTTGCCGAA TCCGCTGATG TAACAAAGAA AATCATAATA ACAAGTAAAG TAATTAAGCT 1200 

TAATACAAAA CCTAGCGGAT AATGTTGTAG CGTCGCAAAA GTTGCTGTTT CTGTCGCAGC 1260 

TTTA2cAATA TCGGCAATAT GATTATCTTG TAAGTAAATT GCTGACGCGC CGAATACCGC 1320 

AAAGAATATA AAGCAAACTA ACGCCX3GGAC AAAAAGTACA CCTAGAATAA ATTCTTTAAT 1380 

CGTACGTCCT TTTQACACAC GTGCAATAAA TATACCTACA AATGGTGCCC AAGATATCCA 1440

CCATGCCCAG TAAAAGATTG TCCAATTTTG TAACCATTGG AATTTTTGAC CACCTGTCGG 1500 

AATGCGTAAA CTCATACTAA AGAAATTTGC AATATAATTA CCTAGACCAT TCGTAAATGT 1560

ATTTAAAATG TATAGCGTTG GCCCAACAAT AAAAAGACCA ATAAGTACTA CAAAAGCAAO 1620 

TAACATGTTG ATATTACTCA ACGTTTTGAT ACCTTTATCG ATACCTGACC ATGCTGACCA 1680 

AGTAAATAAT ATGGTTGCAA TGACAATCAA 6ATTACTTGC ATCGTGAAGT TACTCX3GTAC 1740

ATTAAATAAA AAATGTAAAC CTTCGTTTAT TTGCAATGCA CCGAAACCTA ATGTTGCAGC 1800 

CATTGCCTTT TCACCTAATA AAGGCGTCAA TGTAGCGCTG ACTAAGCCAG GATATCCTTT 1920

ATGAAAGCTA AAATATGCAA ACACTAGCGC GACAATACCA TAGACTGCCC ATGCATGAAT 1980 

CCCCCAATGG AAAAATGAAA ACTGCATTGC ATCATTAATT GCAGATTGCG TGCCAGCTTT 2040

ATGAATAGGC GTTAATTTGA AGGCATGACT GATTGGTTCT GCCGTTGTCC AGAACACAAG 2100 

TCCTATTCCC ATACCAGCAC TAAATAACAT AGCAAACCAA GACGGCAATG AGAATTCAGG 2160 

ATCTTCGCCT TCTTCACCTA ATGTAATGTT TGCGTATCTC GAAAATAAAA TATACACACA 2220 

GACAAATAAA ATAACTAAAA CGAGCAATAA ATAATACCAA GAAAAATGTA GCGCAATAAA 2280 

TGTAGTAATG TTTTGCGTGA GTTTTTCTAA CTGTTTCGGA AATATTGCTC CAAAAGCAAC 2340 

AAATATCGTA CATATCACTA AAGATACCCA AAACACTAGA CTTACTGATT TATTTTTCAT 2400 

AAATACAAAC CCTTTCTGTG TAATGGTAAG TTCATACCCA TAACTGCAAC ATTTTAATCA 2460 

TTTGTAATTT TATATAGACA CAATTAATAA TGCCTCATCT TTTAAAAATG ATATATAAAA 2520

CACACTCAAA TTATTTATCA TTGAGCAACA AAGTATTTTA TTGTATTTAA GTAATGCCTT 2580 

TCTAGTGCAT TATTGATTTG ATACCTGCAA AGTTGCCATA TTTCCGTTTA GAATCAATAG 2640 

TCGCTAGACA CAAAAAATAA GTCGCCTATA CAGTATTTTC TGCATAAGGC GACTTTACTT 2700

ACTAATCTAT ATATTAATTA CTAATTTTCC AATCATTGAT TGTTTTTCCA ACAATTGATG 2760 

TGCTTGATAT AAGTTTTCAG GTGATAAACC TTCAAAAACT TGTGTCGTTG TTGGTTGGTA 2820 

ATGCCCTGAT TCTATATTTT TCGTAATATC TTCTAAATAC TCATGTTGTT TAATCATATC 2880

AGGCGTTCGA TGAATTGGAC GCGCAAACAT AAATTCATGT GTAAATGTTA TACTTTTTAA 2940

TTTTAATGCA TTTAAATCTT GATCTTCATT AAAAGCTACG ATAGTCGTAA TATGCCCTAA 3000 

TGGTTTTATC AGTTCAATCA TAGTATTGTA ATACAAGTCT GTATTATAGG TGCAAAATAT 3060 

ATAATCTACT AATGGAATTT CTTTAAATTG ACGCACTAAA TCCTCTTTAT GATTCAATAC 3120 

GATATCTGCG CCCATCTTTT CACACCACTC T6TTGTTTCT TGTCGTGATG CTGTTGTAAT 3180 

GACAGTTAAA CCATACCGTT TAGCAATTTG AGTGGCTATA CTGCCTACAC CACCGGCACC 3240 

ATTAATGATT AAGACAGACT TCCCTTCGTT TTCAGCAGGA TTCGTAGAAA TTTTAAATGT 3300 

ATCAAAAAAC GTTTCATATG CCGTAATACC AGTTAGCX3GT AGACTAACCG CTTCATTAGC 3360

ACTTATGTTG TGTGGTGCTT TTGCAACTAT AGCTTCTGAC ACCAATTGAT ATGTCGCATT 3420 

TGATCCTTGT CTATTTGGCG ATCCAGCATA AAATACAACG TCACCCGGAC TAAATAATGT 3480 

AACGTCTGGT CCGATAGCTT CAACAGTACC AATAGCATCA AACCCAAGTA CACGAGGTGC 3540

TTGAGTGACT TCCATTTGTC GTTGCTTTGT ATCTACAGGA TTTACACTAA TGCTATTTAC 3600

ATTTCCTTCT TCCAATTTAA AGGGCTTCTC AAATCCTATC ATTTTCATAT CGTTTCACCT 3720

CATTTATGAA CTTATTTCTT ATTATACAAA ATAGAAGCCA TGTGTGCTTA TATCGCAGCA 3780 

TCATGACTCC TTTTTCATTT GAATATATAA ATAATTACAG ACGACTTTCG TATTAAATTT 3840

TAGACTTATT TCTACC7VTGT TGCTGAACAA ATTTACTTTA GATAT^AAAAT TATTAAATTT 3900 

TGGTCAATTA ACAAAGTTAG TTTGTTAAAA CGTgATACTT TATTATTCCG TTACTTTAAT 3960 

AACTTGTTTA CCAAAGTTAT CGCCAGTaAA TAAATTTTTA AATGCATGTG GCGCATTTTC 4020 

AAAACCATCT TCAATGGTTA CTTGTGACTG AATTTTACCT TCTTGAACCC ATGTTGCAAG 4080 

CTGTTCACTA GCTTCTTTAA AAGCATTAGC GAATTCACTT ACCAAGAAGC CTCTCATCAT 4140 

TACTTGCTTC TTAATAAGCG TACCTTGAAT ACGTGGTCCG ATATCGGCTT CAGGATGATT 4200 

ATATGACGAA ATTGCGCCAC ATACTGGTAC ACGTGCAAAA CGATTTAAAT GCTTAAATAC 4260 

TTCATCGCCA ACTGTTCCAC CAACATTTTC AAAATAAACA TCAATACCAT CTGGTACTOC 4320

TTGTGCTAAC GCTTCTGCAA AATCCTCTTT CTTATAATCA ATACCAGCGT CAAAGCCCAG 4380 

TGTCTCTGTT AAATAATTTA CTTTTTTGTC OCCACCCGCA ATACCTACTA CACGGCAACC 4440 

TTTAATCTTA GCAATTTGAC CTACAACTGA ACCTACAGCA CCAGATGCAG CTGAAACCAC 4500

AACAGTATCA CCGGCTTTAG GTTGTCCAAT ATCAAGCAGA CCATGATATG CTGTTTGTCC 4560 

TGGCATTCCT AAAACACTTA AATATAAATC AAGTGGTACA TCTGTCGTTG GAACTTTAGT 4620 

AATTTGATCC GCTTGGACAT GATTAATGAT TCGCCAAGGC AACATACCTA CAACGACATC 4680 

TCCTTTTTTA TAATCTGCGA GTGTCGAATC AATTACTTTT GCAACGACAT GGCTAACAAT 4740

CGGTTTACCA ATTTCAAAAG GCTGTACATA CGAATCTGCC TTAGTCATAC GTCCTCTCAT 4800 

ATATGGATCC ACTGAAATAT ACAGCGTTTG TACAAGTACA CCATCGCTCT CAAGTTTaGG 4860 

CGTGTCAATC TCTTCaATTT TGAATGTATC CTCTTGAGGC ATGCCkTCTG GTATTTTGTT 4920 

AAAAAGAATT TGTTTATTTT GCATCATTAA TCACCTTTCT TTATTTGAAA CTTTTACTTA 4980 

TTTGTTACTT AAGCGTTAAG TTTGAATTGT GTCtTCGTGA TGTCTGTATG CAAATACATT 5040 

CTTAGTTGTT ATATTTTGAC TTAAGCACTG ATTCATTCAT GTAACTTCAA CCACATTATA 5100 

TTTGCTATAA TCATAAATTT AAAATGTTAC GACTTAGACA TTTTATGGAA ACTCTCAAAC 5160

AATAGATAAT TTTTGAAAAG CTCTAATATT ACAAGCTTTT TTGCCCCAGA AAAACTAGCA 5220 

GTTGCTTTAT TTTTTCCATA AGAAGTCX5AT TAACTCATTA GCAACATTTT CATTCTCATG 5280 

AAGCTGACTA TGTTGTGCAG GCTCACCTTC ATATTTAGAT TCTCGATAAC TTTTCGGACT 5340 

ATTTCCCAGT AAATATTTTA ATGATTTCGA AGAACTATTA GACACTCTGC CGTCTGAATG 5400 

EP 0 786 519 A2 





CTACTAAATA TTGACCATCA CCAATAGGTC CAATTTCATT GAATGTAGTC CAATATTTTA 7320

CTTCTGGGAA TTCTTTAAAA CAATATTCAG CATAATCTAC AAAGTAGTCA ATCGTTTTAC 7380

GATTTAGAAA ATCGCCATCT TTGTGTAaCA CTTCTGGTGT ATCAAAATGA TGCAATGTTA 7440

CAAATGGTTC AACATGACGT TTATGACACT CTGCAAATAA CTTATGGTAA TACTCAACAC 7500

CTTTAGGGTT AACTTCGCCA TATCCATTTG GGAAGATACG AGACCATGCA ATTGAAATTC 7560

GGATACCATT AACACCGAAT TTTTCACTTA ATTCTAAATC CACTGGATAT CTGTTATAAA 7620

AATCACTCGC TGGTTCTGCA GTGTACXaAT AGTTTTCTTC TAAATACGTA TCCCATGCTA 7680

CX5CGACCTTT ACCATCCGTA TTTGTOGCAC CTTCTGCTTG ATATGCTGCT GTTGCTCCAC 7740

CAAAAATAAA ATCTTCAGGT AATGTTTTAG TCATATGAAA AACTCCTATT CTTAATTTTC 7800

AAATTGTTGT TGAACGAAAT CAAGGGCTGC TTGGCCATCT CGTGTCAATT TGATATATTC 7860

AGCACCTTGA GTCTTCGCTA ATTTAATACC TAATCTATCT GTATCTTGCT TAATATCTTC 7920

ATAGTTAGAC GCAACTTGTG GCGCTAAAAT GATTAATTGG TACTCTTTCA TAATGTCCAT 7980

ATGTGCGCCA TATCCGCCAG cTGCCGCTTT CACTGGCACA TGATATTCTT CAGCTGCTTT 8040

ATTAAGTGCA TTGGCTAATA ATCCACTTGT ACCACCACCG GCACAAAGTA CTAAGACATT 8100

TGTTTGTTCT GTGATATTTG AAGCTTTAGC TGCATCGTCT GATACACCAC TTGCCGCTAA 8160

AATTGAATCA GL"l"i"l"riTCG TATCAAAGTT TGCTGCAACT TTTTCTTTTA AATCTGAATT 8220

ACTTTCTTTA CGTCCTTCTT CTTCATCAAG AATTTCACTA TCATAAACTT TTAGGAATGG 8280

GTAGTAAATA ATAATATCTA CAACAATCAA AGTAATAGCT AGTACGAATG ACCATAAACC 8340

AAAACCTGTA CCCATGATAA TGCCCAATGG ACCTGGTGTT GTCCAAGGTA AATTCACACT 8400

AAAACTATTC ATTCCTAACA CTTCAACX3AA AAGTTTGAAA A7CCATACGT TAACAATTGG 8460

TGd^TACA AATGGAATAA AGAACACAGG ATTCAATACT AGTGGTGCAC CAAATAAAAT 8520

TGGrrCXSTTT ACACCAAAGA ATGTTGGTAC AACTGATGCA CGTCCAATCG CTTTGTTTCG 8580

TTTAGATTTC GTCATCCACA TAAACATGAA CGGGAC6ACC AATGTTGCAC CCGTACCTCC 8640

AAATGTAACG ATAAACATTT GTGTACCTGA TGTAATAATT TTATCTGCGT GTTCTCCAGC 8700

TTGAAGCAAC TTGAAGTTCG CTTCGATATT CGCATATGTA ATGGCTGCAA TTGCTGGCTC 8760

TACAATTGAC GGACCATGAA TACCTACAAA CCAGAATAAT GCAAAGGCAC CAAAGATAAT 8820

TGTGACACCA ATCCATCCAT CTGCTGCTGT AAATAATGGT TCGAATAATT TTAAAATACC 8880

ttccxx:taca tttgatttaa AGCTGTTGCG AATGACTAAA TCTAATGCAT AAAGAATGAT 8940

GATTACCGCT GAAAATGGAA TTAAGTCCTT AAATACTTGT GAAATATTCG GCGGTACTTC 9000



TCATCATAAT TATTTAAATT GACATAACCT GTTTGTGCTT CTTGTGCATT CAGCATGCCT 10920

AAAGTATTGG CTTTTTTTAG TAAATCGTGG TCGTTTTCAT GATTAAGAAT TGCTGAAGTA 10980

ATTCCAGCAA CTGTAGAATC ACCTGAACCA ACCGGATTTA ATACACTTAT TGTCGGAATA 11040

TTCACTCTAT AGAATGTATG ATTGTGCTTA GCGAATGCAC CTTGTGCACC TAAAGACACA 11100

ATAATCCACT CAATCCCTTC GAATAAGGGT TGTGACACTG CCTGTTTCAA ACTTTCTAAA 11160

CTTTCATCAA GTGGCTGGTT AAGCAATTGA TATAGTTCAG AAATGTTTGG TTTAATGACT 11220

GTAGGTTTGT ATGGATTTTC CAAAACTGTT TGCAAAGTtG CACCCGAGCA ATCTAATATC 11280

ACAGGCACAC CTTTGTTTTG GCATCGTTCA ATGATTTGTG CATAATAATC TTGATTTAAT 11340

CCTTTAGGTA AGCTACCTGA AATAGCAACT GCTTCAACTT TTTCTAATAA TTGTTCAAAA 11400

TGTTTAATAA ATCCTGCAGC CTCTTGATTA TCAATCTCCG GTCCCTGCTC TAAAATTTCT 11460

GTTTGTTGCC CTTCATGTAA AATTGCAATG CAGTTTCGTG TTTCACCCTT AATGTTATAA 11520

AATGCATGCT TGATGTCX5GC ATGATCTAAT TTTTTAGCAA TAAATTGACC TAATTCACCG 11580

CCAATAAAAC CACTCGCAAG GACTGGCTCA CCTACTTGCG CAAGTACTCT TGTTACATTT 11640

AAACCTTTAC CACCAGCTGT TTTACTTACT TCTTGAACAC GATTAACATC ATCTAATTTC 11700

AATGCTGTTA ATGGGTATGA AATATCAACG GATGGATTTA ATGTTAAAGT TAAAATCATA 11760

TGTGTCGTCC CTTAATCGTG GTATTCGCCT CTGTCCCATT TTTCTAAGAA TTCATCAAAG 11820

AAATGTGGAT CAGCTTGATC TGCATTGCTT GTTTCTAAAT GTTTAATTTT AGCGATTAAT 11880

TTTTTGTTCT CTTCAGTTGG TTTATATTCA GCATTAATAA ATGCATCGAT AATATCGCAC 11940

ATTAATAACT CACCTATAAT ACGTCCACCG AAGCCAATAA CGTTCGCATT TAATTCTTCT 12000

TTAGCGTATA ACGCTGATGT CATATCACGT ACTAGTGCTG AACGAACGCC AGGTACTTTA 12060

TTTXCAGCAT TGTTAATACC AACACCTGTT CCACAAATAC AAACACCTAA GTCTGCATTA 12120

CCGCTAACAA CTTGTTCGCC AACTTTTTTA CCAAAAATTG GATAATGTGT TCTTGTGAAA 12180

TCGTATGTTC CTACGTCAAT GACTTCATGT CCTTTTGATT TTAAAAATTC AGATACACGC 12240

ATTTTTGTAT CTGTAACAAT ATGGTCGCAT CCTAATGCAA TCTTCATAGT AATTTTTCCT 12300

CCTTAGCACA TTTTATTAAG CATATCTACG CGGATTTGGT GTCTACCACC ATCGTATTTA 12360

CCTTCAACAA AACCTTTAAC GACATTTTTC GCTAATGTGT CTCCAACAAT TTCAGATCCC 12420

ATAGTGATCA TTCTTGAATT GTTATGGCCT CTAGTCATAT ATCCAGAGCG TTCATCTGAT 12480

ACrrCAGCAQ CAATCATGCC TTTGATTTTT GTAGCAACCA TAAAGCTACC TGCACCAAAT 12540

GCATCGATAA CAATACCTAA GTTACCTTCT TGACTTTGAA CATCTTTTGC TACAGCCAAA 12600

TCTAATAAGT ATGATTTGAT GACTTCTTTT AATCGTTTGC CAGCTTCATC TGAACCAATA 12720 

ATAATCGCCA TAATAAGACT CCTTTTTACT TTAATTTTGA AATACCTTTC TTAAAATGTG 12780 

ACATATTTAT TTGTAGGTTA TGAAAATCTT GAGAAAAGGC TTTCAATTTG ATTACGTTTA 12840 

AATTATAAAC ATAAACAAAC AATAAATCAA CATAATATGT TTATAATATG TTTGTTTATG 12900 

ACGTATTTTC AAACAATAAG TGAACATTCA TATTGTGGTG TTGTTTTAAT TAGGTATTCG 12960 

TCTGAAATTG TAGTAAAACT TTGTCGAGGT TCCCGTTGaC ATAAATTTGC ATAAAAAAtA 13020 

GCCCATAAAT GAATGCAAAT TCACATTCAC TTATGAGCAT ATAGATACAT ATTTTAACAA 13080 

TGCAGTTATA CTTTTAATTT AGTCGACTAC TTCAATATAT GTTTTAATCG TTTCTACTTT 13140 

TTCTTCATCT TCATA6TCCA TGACCACTGC AGTCAATTCG TTTAACTGAC AAAATGATGT 13200 

AAAATCTTCT TTGCCAACTT TCGTATGATC GATTAACAAG TATTTTTCAA TTGAATTACT 13260 

TAGTGCCAGT TGTTGCGTAT AGGCTTCATC TAATGTAGAT GTCATCACAG CACCTTTATT 13320 

TACTGCGTTA CTACTAAAGA ACATCTTGCT AAATCTTAGT TTTTCCAACA TGGCGTTCGC 13380 

CATTTCACCT ACAAATGCTT CTGTAATATG GCGCATTTCA CCACCAATTA AATAGACACG 13440 

AAAATGTGCT GTTTGTTTTT CTAACAAAAT TTTATACACC GGCAAACAAT TCGTAATAAT 13500 

TGTGAGCGTA TGATGATTGA CTTCTTCTGC TAATAGTTCC ACTGTTGTTC CTGGTCCGAA 13560 

AAACAAAGTA TCCCCATCTT CAATTAATGA TGCAGCTTTT TTAGCTATAA ATCGTTTTTC 13620 

TGCAATTTGA CGGGTATGTT TTTCTTTATG CGATATTTCT TTATACTGAA ATGTTGAATT 13680 

ACTGCGTGCA CCACCATGAA TCTTCGTTAA AATCCCTTTA TTTTCCAATT CAATTAAATC 13740 

TCTTCGAACT GTCATATCAG ACACATTTAA ACCTTCGACG ATTTCATTCG TTCTTATCGT 13800

GCCCTTTTTA TTCACTAGTT TAGCAATTTC GTCCAAACGT TCATGTTTAT TCAATGTAAA 13860 

ATTGfirrC 13868 
(2) INFORMATION FOR SEQ ID NO: 174:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4549 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:

TTAAGTCAAC TTTGTCTATA CGGTTTGGAT CtTCTaCCCA ATGTCTTATA AAAGACAATC 60

CC6CACCTGA AACATAACTC ATGAAATAAG AAAATGGTAT ACCATTAATT TGATCATTTT 120

AATCTTTACC CATACC5AAAC ATCAATTGAT AAAATGCGAT GTCTTTTTCT ATCATTTCTA 240

TTAAAACGGT CATAATTTGA TGTATGTTAT CCGTGGATAA CTTAACTGCT CCATTTAACT 300 

TCTCATCATG AATGAAGTCT CTTATTTCCT CCAACTGCTG GTCCTCTAAT TTTTCAAGCA 360 

AATCATACTT ATCATAATAA TGCGTATAAA ATGTACTACG GTTAACATCA GCTAAATCTG 420 

CAATTTGTTG CACAGTAATC TCTTCTAATT GGTGTTGATG TAAAAGTTCA ATAAATGCAT 480

TTCTCATTGC AACTTGTGAT TTTCTAATAC GTCGATCTAT AGTCATTTAT ATCAAGTCCT 540 

CCCCAATGAT TATAAACGTT ATGTTCATTA TCCCACAAAT CTCCAACATT GATGATTGGC 600 

ACACAATGTT TACCTCTTTA ATATAGGTGA TACAAACAAA CAGAAAAAGG TGATAACAAT 660 

GAACCAACAT TTACTAGGAA ATCCAAAATT AACTGTAACT CATGTCAATG AAGTTAAAGC 720 

CGGTATTAAC CACATCGTTG TCGACAGTGT TCAATATGGA AATCAAGAAA TGATTATGGA 780 

AAAAGATGTC ACTGTGGAAA TGCGCGATGG CGAAAAATTA TATATTAATA TTTTCAGACC 840 

AAATAAAGAT GGCAAATTCC CTGTAGTTAT GTCTGCAGAT ACTTACGGTA AAGATAATAA 900 

GCCTAAAATC ACAAATATGG GTGCCCTTTG GCCAACATTA GGTACCATTC CGACATCTAG 960 

TTTTACACCT GAAGAATCAC CAGACCCAGG ATTTTGGGTG CCAAATGATT ATGTTGTAGT 1020

TAAAGTTGCA TTACGCGGTA GTGACAAATC CAAAGGCGTC TTATCTCCAT GGTCAAAAAG 1080 

AGAAGCGGAA GATTATTACG ArTGATTGAA TGGGCAGCAA ATCAGTCATG GAGTAATGGA 1140 

AATATCGGGA CAAATGGTGT TTCTTATCTT GCGGTGACTC AATGGTGGGT CGCATCATTA 1200 

AATCCACCAC ATTTAAAAgC AAtGATTCCT TGGGAAGGCT TAAATGATAT GTATAgAGAA 1260 

GTAGCCTTTC ACGGAGGTAT mCCAGATACT GGCTTTTATC GTTTCTGGAC TCAAGGTATT 1320 

TTTGCGAGAT GGACAGATAA TCCAAATATC GAAGATTTGA TTCAAGCACA ACAAGAACAT 1380 

CCTciGTTCG ATGATTTTTG GAAACAGCGT CAAGTGCCAT TATCACAAAT TAAAACACCT 1440 

CTACTAACAT GTGCTAGTTG GTCTACACAA GGTTTGCACA ACCGTGGCTC TTTTGAAGGA 1500 

TTTAAACAAG CTGCATCTGA AGAAAAATGG CTATATGTGC ATGGACGTAA AGAGTGGGAA 1560 

AGTTACTACG CTAGAGAAAA TCTCGAACGC CAAAAATCAT TCTTTGATTT TTACCTTAAA 1620 

GAAGAAAATA ACGATTGGAA AGATACGCCT CATGTCATTT ATGAAGTTAG AGATCAATTT 1680

TATAAAGGCG AATTCAAATC AGOGTCACGT GTCCCTTTAC CTAACX3CAGA ATATACACCA 1740 

TTGTATTTGA ATGCTGAAAA TCACACATTG AATCATGCAA AGATTAGTAG CGCGCATGTC 1800 

GCACAATATG ACTCTGAAGA TAAACAACAA GATGTAAGTT TTAAATATAC GTTTGACAAA 1860 

GATACTGAGT TAGTTGGAAA CATGAACTTA AAACTATGGG TAAGCACTAA AGACTCAGAT 1920

GATCGTATTG TTTTAAACCA TCCACACCAA CACTAAAATC AGCAAATTGC TTCACAAATT 3840

TCGCTTTATG TTCAACACCA TAATTTAACA TATCGTGATA AACCAATACT TGACCATCTG 3900 

TACCTTTTCC TGCACCAATA CCAATGACTG GAATTGTTAA GTGCTTGCTA ATTTCTTCTG 3960

CTAAATCATT TGGAATTGCT TCAAGTACTA ACGCAACTGC ACCAGCTTGT TCTACATTTT 4020

TCGCGTCTAA AATAAGTTGc TCCGCTGCTT CTTTCGTTGC ACCTTGTAAT TTATACCCCA 4080 

TAACGCCAAC ACTTTGAGGT GTTAATCCTA AATGTGCAAC AACAGGAATA CCAATTGCCG 4140 

TTGCTTTTTC AATAAATGGT GTAATATGCG CTCCTTCTGC TTTAATTGCA TTTGCATTCG 4200 

TCTCCTGATA AAGCTTTAGA GCATGATTTA AGTCTTGTGT CATAGAGATG CCTACTGCAC 4260 

CAATCGGCAT ATCAACAACT ACAAATGTAT TTGGTGCGCC TCTTCTTACT GCACX3AC0GT 4320 

GATGAATCAT ATCTGCTAAC GTCACTTGTA CGGTACTTTC ATAACCTAAT ACAGTCATAC 4380 

CAAGTOAATC CCCAACAAGA ATCATATCAA TACCCOCTGC TTCCACTTGT TTAGCACTTG 4440

GAAAATCATA AGCTGTTACC ATAGAAATTT TAGTTTGCTT TTGTTTCATA TCTATTAATT 4500 

GACTTACTGT TTTCAATGTT ATTCAACCTC TTTTTGCAGT ATnATTAGA 4549 

(2) INFORMATION FOR SEQ ID NO: 175:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8339 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175: 



TTATCTTTTG TTGTTTCCTT AGACAAACGA CTAACCACAT TATAATGGAC TAATTTATTA 60

A'rrPiATTTA ATTCCATTAA GTTATCCGTA ACACTAAGTG AAGATGCGGA GTTCACTCTC 120

GTTTGTACTC TTCGTTTTAA TAAAGCACCT CGTAATAATA CAATCATTCT TCTTATTAAT 180

GATGCTTGTC TATATACCTG TGTTCTTTCA GCATAACGCA TATAGTTTTC AAGTACACTA 240

TTCGTTATTT GTCCTTCATC TACTAAAGAC TCTAATGTTT TkGTTTCTAC ATTAAAAGCA 300

ATTTTritiTA GACGTTCTAA TTCTTTAGAG TTTTCATCAT CTTTCTCTAC AGTTTTTAAA 360

AATGCTAATT TATCATGATA TTCTTTAATC ACGTTACCAT ATTTAAAACT TGTTTCGAAA 420

GTAGATTTTT GATTTAGATA ATCAATAACT TGTTCTAATA TATAAATTCT AGCAACTTTA 480

AACGACATAT TGCCAATTAC TGTTTTAGGT GCAGGTTTCG TTAATAATGG CAATAATACT 540

TGCGCAACTA CCAAACTAAT AATAACCATA CCAGATGCAA TAAATAATAA GTCGTTTCTA 600

ATTGTTCCAT 6CACACCACA TAACGTCATA ATTAAAGCGT ATAAACTTCG CTTTGGTGGT 720

TTCTCAGTCG TTGGATTATC ATCATCATTT TTAGTCATCA TTTTTTGGAA TGGACTGATG 780

GCTAAATAAA AATAAGGATA TAAGACATAA ACCCAAACAA ATCTAAATAG ATAGACAGCT 840

AAAGCAACAA CAATAGTGAT GCCTATTAAA AAGATTAAAT TGTGCGGTTC TGTTTTGATA 900

ATTTTAATAA TAACTTCAGG TACTAAAAAT CCTAATATTG AAAAAACAAA GCCATTTAAA 960

ACATAACCTA GTATATTCCA TGTATGATTG TAACTCATTT GCAGTTGTGT ACGTACTTGC 1020

ATAATTCTGT CACGTTCGAA ACCATGTACA AGTCCTGCAA CTACTGCTGC AATGATTCCT 1080

GATGCGTGaA ACAATTCAGC AATTAAATAC GTAACAAATG GTGTTAACAA TTGAATAATT 1140

GTAAACATAT TAATGTTTTC ATATCCTCGA CX3CATCAATG TTAATCG6AA CCTTACTAAT 1200

GCCATACCTA TAAGTAAACC AACCACTGCG CCACCAATTG ATGCAATTAA AAACAACTGA 1260

ACAGCATCAA CAA6TGAAAA AGCACCT6TA ACTAATACTC CAACAGCTAT TTTAAATGAA 1320

ATAATACCAG CAGCATCATT CAATAATGAC TCACCTTCAA GAATTGTCAT TGCTCCTTTT 1380

GGCAAGACCT TTCCTTTAGT GATTGCTTGC ACTGCTACTG CATCAGTAGG ACAAAGAATG 1440

GCAGCAATTG CAAATGCTGC TCCAATAGGT AAATCTGGCC AAATCCAATG AATAAATAAA 1500

CCTACACCTA TCACAGTAGT AATGACTAAT CCTAATGCCA TCATCATCAC TGGCTTAATA 1560

TATTTCCTTA AATGGACTCT AGAAACATTA ACACCTTCTA CAAATAACAA AGGCGCAATC 1620

ATTGTTACCA TAAACAATTC AGAATCAAAA TTAAATTGAA CAGGGATTGG GGTAATAAAT 1680

AGTAACATGC CCAAGAAAAT TTGTATAAAT GCTAGGGGTA CITTAGGTAT GAAAGTATGG 1740

ACAAACGAAC TTAGTATCAC AACAGCTATA AATATAAGAA TTGTTTCAAA TATTTCCAAA 1800

CTTTCACCTC TCTAAAAAGT ATTGTTTAAT TGAAAATTAA GTATCACATC TCGTTGTAAT 1860

TATACTTTAG AGGATAAATT GAGTTAGCGA CCACAAAAGC ACTTTAATAT AGATATATGT 1920

CTACGATTGC AGTACTTAAA TTTGCAATTA TTTAATTTTA TTTTATCACT AATTGTTTGT 1980

ATAAATAAAC AACTTGCTTT CACATAACAA CATTAACTTA TAATACAAAA AATGAGCACC 2040

TTAAAATCX3A CTAACCAATT TCaAAGTACT CTTTTAATGA TTAATTTTGA AAACAGATTT 2100

TCaAAAGCAT TGTTATGCTT AACAATTTAG CCAACACTTC AATCGTTTTG ATACCATTTC 2160

TTACGATGCT CTTCTCGTTT TTCAGCACGT AATTGTAATG CTTCTGTAGA GTTTTGTTCA 2220

TTTGAACTTA ATAATATTGA TGCATGTGTG TGAGCATCAT TTTTTCGATA CATATAAGCG 2280

CCGTTGCGAT AAGCAGCGCG AGCGACTAAG TGCATGCCGA CTGGTGAAGT TAAATTAATA 2340

AAAACAAGTG ACAGTAATAA ACGCACACTG AAAAATCCTG TATTCACAAT AAAATAAATT 2400

CTTAAGAAAA CATCTTGGAA TTTCACGATA CCTATTGCAC TAATAAGAGC AATAAAACTA 2520

CCTAACAACA ACATCACAGC AGCAATAAGA CTAAAGATTT CTTTTGTTAT TTCCATTAAA 2580

CACATGCCCC CCACCAATAA AGCGTGATAT TGAAACAGAA CTTACAAAAG ATATAATGGC 2640

AATGAGCATG ATTGAATCTA AGAAAGAAAC GGTGCCCATA AGTACACTTA ACACACCCAC 2700

AATTGACATT ACX3ACAGCAC TTGTTGTATC AAATGTAACG ACACGATCTG CTGTTGTAGG 2760

TCCCTTGATT AATCTAAATA AACAGATGAT TAATGCAATT CCAAAAATAA TGAGTGAACT 2820

AATAATCATA ATATGTGTTA TTGTTTGTAT CATCGCGACA CCTCCAATAT TAAGTCTTCA 2880

TAATGCTTAA TACTTCTTAA CAAACTATCT rrrrcrm'T ctgacacgtc gatactatga 2940

ATAAAAAACT TTTTAGAGTC TTGAGAAATT CGTATTACTQ TAGACCCTGG AGTTATAATA 3000

ATTAAAATTG TTAAAAATGT TATTGACCAA TCACTTGTTA GTCTTGTTTC ATAT6AAAGT 3060

AATCCAGGGT TCATATCTTT TGTTTTAAAA AGAATATAAT TAATCGTGCT AATGCTAGAT 3120

GTTATTAATT GATATAAATA AACACCTAAA AATTTAATAG CTACCCATAT TTTTCTAACA 3180

TAAAAATCAT CGCTGAAAAA CCTGTGTAAT ATATAAATGA CAATTAAACC AATTAGATAT 3240

CCAGAAAAGA AAGTCGAGAA TTTAAAATGA TCTTCATCTT GAAATAATAC CCATAAGAAT 3300

GCAATGATAA TATTTAAAAC TATTTGATTC ATTTAGTCCT CTCCTTTCAA ATGCGGATTT 3360

ACAAGTTTTT GATATAATTG ATCACTCGTG TTCAACTCAG TTGCATCACT TGTAACATTT 3420

AACACAACAG GTGCAGCAAT TCCGATTGCG ATAACCACAA CTACTAAAAT ACTTAAAATT 3480

CTTTTTCGAT ATAGCGGGAT TTTCTTAAAA TTAACTTCCT CCCCATCTTT ATCTCCAAAA 3540

TACATATAAA AAAGTATCCT AAATAAACTG TACATTGCAA TTAGACTAGT AATAATCATT 3600

AACGCTAGTC CAATATAATT GCCATTTTGC AATGCACCTT GGAAAATAAG TACTTTCCCC 3660

GGA^AGCCAC TAAATGGAGG CACGCCGCCA ATAGCAAAAA TCATTATAAT AAACGCAACT 3720

CCAAATAAAG GTTCTTTTTT AGCTAAGCCA TTCAAATATT GATATTGTCG ATAGCCTGTA 3780

ATGTAAACTA AACTACCAAT AATAAAAAAT AGCAATGTTT TTACAACAAT GTCATTTACC 3840

AAATAAAATA TTGCACCATT AATACCTGCA AACGTGTTTG TTCCTAAACC TAAAATGATA 3900

AATCCTATTG AGATTATGAC TTGGTAAGCT GCAATCTTTT TAATATCTTT ATAAGCAATG 3960

ACACCTATAG CGCCGATGAC CATAGTTATA GCAGCCATAG TTGCTAGCAA TGGATGTATG 4020

AGATCATTAT GTTGATCAAA TAGTAAAGTG AAGAATCGAA TTAATGCATA GGCCCCTACT 4080

TTGGTCATTA ACGCTGCAAA TAATGCTGCA AGCTCAGTAT TTAACACAGC GTAGGCTTTG 4140

GGTAGCCACA TAAAAAGGAC CAGCGCTGCT TTCGCACTAA ATGCGACTAA GAAGATTAAT 4200

AAGTTTAATG TACCTACTGT TTTATAAAGT AAACCTATAC CTAATAAGAA TAGCCATGAA 4320

CCAATAATAT TCAAGACAAC ATAAATAATT GCAGCACGTA ATTGTTCTAC AGATTGTCCA 4380

AGTGTAATGA GTACAAATGA CGCTAGTAAC ATAATTTCAA ACATGACGTA TAAATTAAAT 4440

AAATCTGATG TTAGAAAAGA GCCTATCACG CCAACACTTA AAAATAATAT GAACGATGGC 4500

AAGTGATAAC GATTTGCTTT ATGTTCGCCA CGCCCAAATC CGTATGCCAT AATTAAAGTA 4560

ATCACAAACG AAGCGGTTGT AACCATAATT AAACTTAAAG AATCTCCTAA AAACTGTATA 4620

CCAAAGGGCG CTGACCATCC TCCAAAGTCT AGCGTAATTG GACGGTGACG CTGAACATAA 4680

ATTAATAGCA TTAATGAAAT AATTGTGGTG ATAGTCATTG TACCTAAGTA TAAATATTTA 4740

GAAATACGAT CATTATTTTT TAAAAATACA AGGATTAAGG CACAAAGGAA TGGTAATAAC 4800

ATTGGTAAAA TCAATAAGTT ACTTAGCATC ATCTTCCCCC CTTAGGCCTT CAATTTCATC 4860

TTCTTTTGTT ACTTTATAAG TTCTATAAAC AAGTACAAGT AAAAAC6CAG TCATCCCAAA 4920

CCCTATAACT ATTGCAGTTA GTACAATAGC TT6TAACAAG GGATCAACAA ACAATTGGTT 4980

TCCACCAGTT ATTAGTGGTT CTGATCTACT AGAACCATAC GTTCCCATAC TCATAATAAT 5040

GAGATTACCA GCATGAGTAT ATATTGAAAT TCC6ATTACA ATACGAATTA AATTGATTGA 5100

TAAAATCATA TATGTTCCTA TAAACACTAA AAATCCTATA ACTAGTAATA ATATTAAATT 5160

CATGATCGAC CTCCGCTAAG CGACAACATC ACTGTGACAA TAACACCAAC AACTGAGAAT 5220

AAAATACCTA ATTCAAAAAG TGTTATTGTA CTTACATGAA TTTGTCCTAA AATTGGAAGT 5280

ATCCAAGTTG TTTCATATTG AGACAAAAAT GGTTTTCCAA AAAACATAGG TATTATCGCA 5340

GTAATAGATG ATACCAATGC TCCAATAATC ATTAAAATTC TAAAATCAAT CGGTAAACTT 5400

TCTAAAACCT CTTCAACATT AAAAGCCAGA AACATTAAAA TAAACGCTGA ACTAAATATT 5460

AAACCACCAA TAAACCCACC ACCAGGATTA TTATGACCTG CGAAGAAGAC ATAGAATCCG 5520

AAAGTCAATA AAATAAATAC AACAAGTTTC GTGACCGTTC TTAACACGAC ATCATTCTCT 5580

TTCATCTTGT CCCCTCCGAT CTTGATAATT TAATAATGtg TAAATACCTA GCCCAGTAAT 5640

AATTAACACT AATCCTTCAA ATAATGTATC TAATGCTCTA AAGTCACCAA GTATCGCATT 5700

TACAATATTT TTACCACCTG TTAGTTTGTC AGCTTTTAAA TAAAAGTCTG ATATTGATGA 5760

TAAACCATCT GTTTGTTGTG TAATAAAAAT TAATGATACA ACAATAAGTG CCATCAAGAG 5820

TGATACAGAA ATTTTAATTA TTTCTCTTTT TTTGTTAGCG TTAGATCTTG GCACGTTTGG 5880

TAATCTTGAA AAACTGACAA TAAATAGTAT CGTCGTTATT GTTTCAACTA CTAGCTGAGT 5940

CAATGCTAGA TCAGGGGCTT TCATTGCTAT AAAGAATAAG GTCACAACAA ATCCGATGAC 6000

GACAGTTACG ATTGCTAATA TAATTTCTAA TGCCCCAAAT TCAGAAACAT GTAACTGATG 6120

TACTTTAGGA AGTCCaATTC GAATATAACC ATATCCAATG ATAATCATAA ATATGCCTAA 6180 

GGTCATAATA ATGTACTGGT TTAAACGATC TTGCATAACA CGTTTA/^TC GCTTCGTAGC 6240 

AAACTTTTCA AAATGTCGAT ATACCATCTC ATAGCTTTTT GAAACTGAAA TCTGTCTAAT 6300 

TTTACCTGTG AACACTTTTT TCCAATCTAC TTTGATTGCT AGTACACTAC CCAATAAAAT 6360 

AATGATGATG GTTAAAAGAA GCGGTATGTT AAATCCATGC CATTGCGAAA CATGTGGTGC 6420 

CAATTGATCA ATTTGATGAT TACCACCTGA TACAGCTCTT AATGCnAGAA CGATAATCCC 6480 

CTTCCCAAAT ATATtiTGGTA CAAAAAAGAT TACAGGTACT AGCACCATTA aTATAAGAGA 6540 

TGGTAAACTA aACAACCATG 6TTCGTGGAT ATTTTTTTTA GTAAAAACCT TAGAATCATA 6600 

TTTTGtCCAA AATACTTCTT TTACCATGTA TAGTGCATAT GTGAATGTAA AAACACTCXK: 6660 

AATAACACCA ACAAACACGA TAGCTATCAT TGAAATCAAA CTAAATTGGG ATAATTGTCC 6720 

AGTTTGTGTT AATGCATCTA AAAACATTTC TTTACTTAAA AATCCATTTA AAAATGGTAC 6780 

TCCAGCCATA GATAGAGCCG CTATCGTCAT GACTAGATTC ATTTTAGGAA ATAGTTGACG 6840 

CATTqCACTT AAAATTCGTA TATCCCTTGA ACCTGCTTCA TGATCTAAAA TACCTACTCC 6900

CATGAAAAGC GCAGATTTAA AGATGGCATG ATTCATTAGa TGAAATAGcG CACCArATAA 6960 

TAQnAATACA TAAATaGATG CTATTGCGTC TTGTTGGTGT TGAGCATATC CGCCACCTAT 7020 

ACCCACCATA GCCATAATCA TCCCAAGTTG ACTGATTGTA GAGTACGCTA GGATACCTTT 7080 

TAAATCCCAT TGTTTTAAAG CTGTAATTGA ACCAAATAAC ATTGTTATTA AACCAACAAA 7140 

CGTAACGATA TATACGTACA TATTGCTAnG ACCTAATAAT GGTGTAAATC GAAGTAATAG 7200 

AAnGATACCA GCTTTTACCA TCGTGGCTGA ATGTAAATAA GCACTTACAG GTGTAGGTGC 7260 

AGCaTTGCT CTAGGTAGCC AGTATGAAAT GGAraTTGTG CTGATTTTGT AAATGCACCT 7320 

AATAAAAACA TAAAAATCAT AGGGATAAAC AATCCATGAT TCTTAATATG ATCTGCTTGT 7380 

CCTAATATCT CTGTGATGIT ATTCX3TTCCT GTCAT6ATAT ACAGCATAAT AAAACCAACT 7440 

AATAACGCCA ATCCACCAAA TACTGTAATC ATAAATGATT GAATCGCACC AAATTGACTG 7500 

TCACCATTGT TATACCAATA TGAnATCAAT AAAAATGATG ATmCACTCGT TAATTCCCAA 7560 

AAaATGTACA TCmATATCGT ATTGTCTGAT AATACaaTAC CAATCATACT GAACATAAAT 7620 

AACGTTAAAT AAAAATAAAA CCTTGGTAAA TTGTCTTTTC GAGAGGATAA ATATTGAGTT 7680 

GCATAGAAGA ATACTGCAAT TCCAATAAGT GAAATAATAA GAGAAAACAT TAAACTTAAA 7740

CCATCTAAAC GTAAATCTAA ATTAATATCT AATGTCTTAA TCCATGGAAT AGAGQTAGAA 7800 
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GGTGCAACCA ACGCTATGTA CCCGGCATAT TTAGCCAATG CTCTACGTTT AGACATTAGA 7920 

AGTATCATCG CCATAATCAC AAGTATAGCA ATTAATAAAT AAACCAAACT CATTATTAGC 7980 


CTCCTTTGTT TCTATAATTG TAATGAAATA TAAATACTAT GTTCACACTC ATTTTCTAAA 8040 

CCGATAAAAT TTAGTGTTTC AATAGCAGAT TGATGCCCTA AATACTTTTG AATGACTGGT 8100 

ATAAGTATAC CTTTTTGATA AGCATGATAT GCAAATGTCT TACGCAATGT CGTTAGTCCT 8160 


ACATTATCTA TACCAGCTTC AATTGATGCT TGGTGAATTA TTCGATAT6C TTGCTGTCTA 8220 

GATAATACTT GATTTGTTCG TAGTGATTGA AAAAGAAOGT CTTCATTCGA AAGACTCXTG 8280 

TCCTCTATAT ATTGAAGTAG TTCTTTCGAT AATGTTTCTG GTAACCTAAT TTTAATCAA 8339 

(2) INFORMATION FOR SEQ ID NO: 176: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 588 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:

CCCGATTTTT TTACGTAATC TAATACATAC GGCAAAATCA ACTTTAATCA AAAAAGACTC 60 

ATACACAATG CCTTTAAAGC ACATGTATGA GTCCTTTTTA GTAGTTTATA TCAAAAAATA 120 


GTTTAATGTA TAAATTAGTT TTTGTTTACA GATGCGTCGT AGATTGATTC TACAGCATCA 180 

CCTAAAGCTT TATCGAATTC TTCTTTAGAT TGATCAGCTC TTAAATCACT AGCTAATGCA 240 

CGTGAGAAAC TTGCGATAAG TTCAGCGTTA TCTTTAAGTA ATTCATTTGC TTTTTCTCTG 300 


CTGTAACCAC CTGATAATAC AACGACAOGA ACAACATTAG GATGTTCAGC TAACTCTTTG 360 

TATAAGTTTG GTTCAGTAGG AATTGTTAAT TTCAACATTA CTAATTGATC AGCATTTAAG 420 

CTATCTAAAC CTTTTTTAAG TTCAGCTTTT AATACTTTTT CAATTTCAGC TTTGTCTTTT 480

GCATTAATAT TAACTTCTGG TTCGATAATT GGAACTAAAC CTTTAGCAAT AATTTGTTTA 540 

GCAACTTCAA ATTGTTGTTC AACAACGTCT TTGATACCTT GCTCATTT 588 

(2) INFORMATION FOR SEQ ID NO: 177:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2841 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
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AAATACAACC AGGTGACGAT GTAGGTCGTG CATTCAGCTT TGAAACAACA GAATATATAT I860 

TAGATCAATT GCCATGTTGG CTAACGTATA CTAATGCTGA AACACACAAA GTTATCGATG 1920 

ATAATTTACA TCTATCTGCA ATGTATTCAG GGATGATTAA AGGAACCGGG CCACGTTATT 1980

GCCCTTCAAT TGAAGATAAA TTTGTTCGAT TTAATGATAA GCCGCGACAT CAACTTTTCT 2040 

TAGAGCCTGA AGGTC6TAAT ACAAATGAAG TATATGTGCA AGGATTGTCT ACAAGTCTTC 2100 

CTGAACATGT GCAcGTCAAA TGTTAGAGAC GATACCAGGT CTTGAAAAAG CAGATATGAT 2160 

GCGTGCCGGC TACGCAATTG AATATGATGC 6ATTGTGCCA ACGCAGTTAT GGCCTACACT 2220 

TGAAACGAAA ATGATTAAAA ACTTATATAC TGCAGGTCAA ATTAATGGTA CATCTGGTTA 2280 


TGAAGAAGCA GCAGGACAAG GATTGATGGC AGGTATTAAC GCTGCAGGTA AAGTGTTAAA 2340 

CACAGGCGAA AAGATATTAA GTCGTTCAGA TGCATATATT GGTGTCTTAA TCGATGATCT 2400 

TGTAACTAAA GGTACTAATG AACCTTATCG TTTACTAACA TCACGTGCAG AATATCGTTT 2460 


GTTAcTACGT CATGATAATG CTGATTTGAG ATTGAOGGAT ATGGGATATG AACTTGGTAT 2520 

GATTTCTGAA GAAAGATATG CACGTTTTAA TGAAAAACGT CAGCAAATTG ATGCGGAAAT 2580 

TAAGCGTTTA TCAGATATTC GTATTAAACC AAACGAACAT ACGCAAGCGA TTATTGAACA 2640 


ACATGGTG6T TCTCGCTTAA AAGATGGTAT TTTAGCTATC GATTTATTAC GCAGACCTGA 2700 

AATGACTTAC GATATAATTT TAGAACTTTT AGAAGAAGAA CATCAATTGA ATGCAGATCT 2760 

TGAAGAACAA GTAGAAATAC AAACAAAATA TGAAGGTTAT ATCAATAAAT CACTACAACA 2820 

AGTTGAGAAA GTTAAGCGTA T 2841 

(2) INFORMATION FOR SEQ ID NO: 178: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3025 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178: 



ATCTAATTTC AAACCCGGTG ATAAATTGCC AAGCGTGACG CAATTAAAAG AACGTTATCA 60

AGTAAGTAAG AGTACTATCA TTAAAGCATT AGGCTTATTG GAACAAGATG GTTTGATCTA 120

TCAAGCACAA GGCAGTGGTA TTTATGTGAG AAATATTGCT GATGCCAATC GTATCAACGT 180

CTTTAAGACT AATGGTTTCT CTAAAAGTTT AGGTGAACAC CGAATGACAA GTAAGGTACT 240

TGTTTTTAAG GAGATTGCAA CX3CCACCTAA ATCTGTACAA GATGAGCTCC AATTAAATGC 300
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CGAATATTCT TATTATCATA AAGAAATCGT GAAATATTTA AATGATGATA TTGCTAAGGG 420 

CTCTATCTTC GACTATTTAG AATCAAACAT GAAACTTCGT ATTGGTTTTT CAGATATTTT 480 

CTTTAATGTA GATCAACTCA CTTCAAGTGA AGCTTCATTA CTACAATTGT CTACAGGTGA 540

ACCATGTTTA CGTTACCACC AGACTTTTTA TACAATGACT GGCAAACCCT TTGATTCATC 600 

TGACATCGTA TTTCATTATC GTCATGCACA GTTTTATATT CCTAGTAAAA AGTAATAAAT 660 


ACATAAAAAC GTCTATATCC CAGTTATAAA CTGGAGTATA GACGTTTTTT TACGATAATA 720 

ACAATGGCTC AAATTGCTAT TATCTTGCTT AGGTTTTTCX? TTTTAGAAGA ATATTGCTAC 780 

AAAGACAGGC ACAACTGCTA CAACAACTAC ACCAACTAAC ACTAAAGCTA TACTTGCCAT 840 


TGATTCTTCT ACAGGTCCTA ATTCTTTGGC TGGTGCTACA CCTAATGTGT GACCACTTGT 900 

TCCAAGTGCT AATCCTCGGG CAATAGGGTT AGTAATTCGG AAAAGCTTTA AGAATTTATT 960 

ACCTAGGGCA TAAATAATGA CACCATTTAA AATAACTGCT AATGATGTTA ATTCTTTTAT 1020 


ACCACCGATA CCAGCTGATA CTGGTAACGC AATCGCTGTA GTTGCTGCTT GAGGTAACAT 1080 

TQATAAAATA ACATCATTGG CAAATTGTGC TAACTTCGCA AAAGTTAAAA TAATTAATAA 1140 

CGCTACAACT GTACCGATAC CAATACCTCC GATGATACGA TGCCAATGTT TAACAAGCAC 1200 

TTCACGCTTT TTATATAACG GAATCGCAAA ACAGATTGTT GCCGGTTCTA AGAAGAAGTA 1260 

AATAATGTCT CCACCTATTT TGTAAGTCTT ATACGGAATG CCTGTTAAAT AGAGGAAGGC 1320 

CACACCAAAT ACCATACTGA CAAATAGCGG TGCGAATAAG AAGAAACGAT TAGTTTTTTC 1380

AAATAATATG GTCGCTAAGA AAAATGGTAT AACGGATAAC AGTATTCCGA AGTAAGGTGT 1440 

GTTTaGTGCT AAGTGGTTAA TCaTGAGCTT GTGCCTCCTC TATTTTGATC TTTTTTGTGA 1500 

CTTTGTCACC TTTAGATCTC GAAGTAACTT TCATAATAAT TTgTGTGACA TAGCCAGTAC 1560

AAAlfiAGTAA TAGTATTGTT GAGACGATTA TTAGTCCAAT GATTAAAAAT G6TGCTTGGC 1620

TAATGACACC TAAAGAGTTA ACAACTGAGA TACCGGCTGG TACGAAGAGT AAGCCAATGT 1680 

TATTTGTTAG TGTCGTTCCT ACTTTTTCGA CTTCGCCTAA CTTAACAGCA CCAGTACATA 1740

ATAATACAAA TAATAATACT AAACCGATTA CTGATGCAGG CATAGGAATT GGCATAAATG 1800 

ATTCAATTAT TTTCGATACA AAGAGTACTA AAGCAATTAC AATGACTTGG TGAAAAAAGT 1860 


GTGCTGGTTT TGATGCGTCT TTTTGTTGTT TCACGACCAT TGCCTCCTAC GTTTGATTTA 1920 

ACTAAAGTAT AGATGGCTCA CTTCGATTTG CGTGATTTTT AGTCCGAAAT ACAAAATATC 1980 

ATAGGTAAAA TGCATAAAAA AAAGGATTAC TGTTAAAGTA ATCCTATCGA CGCTTTAAAA 2040 


TCTTTCATAA ATGAACGTCC AACTTGCATC TTGACACCAT TTGTCAATAT TACCATATAA 2100 
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5 



TGAATACGTA TAAAATAAGT GGGATTCAAT CGTTTTTCAT AACGATTCAA TGGCTCTGTT 2220

GTTTCGTATT TATGATTCGT TGTATGTATG GTTGTAATAC CATTATGTGT GCCAATCCCA 2280

ATAATATTTT GTTGCTTTAA CATGTGAATT TTATCGTCAA TTTCAACAGG TAAGCTTTGA 2340

TCAAAATTCG CCGACATATC ATTCGCAATT GCACTTGCGT TATTATCATC TTTGGCTTTA 2400

GTCGCACGCA CTTTATTGAC TGCTTGTTCA ATACGTTTTT GACCAAACGG TTTCAAAATA 2460

TAGTCTGTCG CATTTAATTC AAATGCCTGT ACTGCGTATT GGTCATGTGC AGTTGCAAAA 2520

ATAATCGCAG GTGGCTCTTT CATCTTTTGA ATCTTAGCTC CTAATTCGAT CCCATTTTCA 2580

TCCATTAAAT TGACATCTAA AAATATAATG TCATATTGAT TGATCAGTAG TGCTTCCAAT 2640

GTTTCTTTTA CATTTTCTGC CTCATTAATT TCTTCAAAAC CACCAATTTC ATTTAATAAA 2700

TATGTTAATT CATTACGTGC TAATGGCTCA TCATCTATGA TTAATGCTTT CATATTTATT 2760

CCTCCTCTTG TCTTTCATJUV GC3AA6TACAC ACCAAAAAGT GGTACCGCTC GATGTCGATT 2820

CAAATTGTAA TGCTGCX5GAT TTTCCAAATA ATCCTTTTAG GCGTAAGTTT AAATTTTCTA 2880

AAGCACTACC AGTTCCAGAC TCTGATTCTA CAGATGTnTC TCCCaACAAA TGCATTTTAT 2940

CTTTAGAAAT ACCCT6ACCA TTATCTTGTA CAATAATACX? TACAT6TGTT GCAGTTTCTT 3000

TAATCACTGA CACGTCAATA TCGTT 3025



(2) INFORMATION FOR SEQ ID NO: 179: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1689 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:

ACAGAATTTC ACAGCATTTT TAGATGAAAA AATAAGCCAG TCATAGCGTT GATTTAACAA 60 


ATGAATATCA AAATTTAGTG GCTTTATATC AATAAAGGGT TTGTGAATAA TTGATACTAA 120 

ATCACTTTGC ATGTCATTTG TTTGTGTCAT AACTACAACT GGCTTCATAT TTAAACGTCA 180 

CTCCATTATT TAATGTTGTT CATTTAAGCG TTTTATAATT TCATAAGCAC CTTGCTCTTT 240 


TAATTTGTTA CTCACTGTTT TGCCTAACTC AACCGGATCT GTTCCGTTCA TTGTATATTC 300 

AAATCGTTCT TTACCATCTG GGGTCATAAT TAAACCTGTA AATTCGATTT CGTrTTGATC 360 

TGAGATTGTA GCATATCCTG CAATTGGCAC CTGACAACTA CCATCCATTT CTGCTAAAAA 420 


CGTTCGTTCA GCAGTCACAC ATTTTGCAAC CTCATCATTA TGTACTTTGC TTAATAATGT 480 
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TAACAATGTA 


TCTCTATCAA GATAAGATGT TnCAATATCA TCTGACCAGC CCATTCTTCT 600

TAAACCAGCT GCAGCTAAAA TAATCGCATC ATAATCTTCA GTTTGTAACT TTTCTAATCG 660

TGTATCTATA TTACCTCTAA TCCATTTAAT CTCTAAATTA GGATACTTAG ATAATATTTG 720

TGCACCACGA CGTAATGAAC TAGTACCAAT AATACTGCCT TCTGGCAATT GGGATAGTGG 780

TGTATGTGTT TTAGAAATAT ACGCATCAAA AGGTAATTCT CTATCAGGGA TACAACCTAA 840

TGTTAAACCT TCCGGAATTA CACTTGGTAC GTCTTTAAGC GAGTGTATTG CCATATCGAT 900

ACTTTTTTCA AAAAGTTCAT GTTGTATTTC TTTAACAAAT AAGCCTTTGC CTCCGACTTT 960

AGACAATTGT TTATCTACTA TACGATCGCC TTTCGTGaCA ATTTCTTTAA TTTCAATTTC 1020

TAGATTTrGGC TCGACAGCTT TTAATTTATC AATAAATTGC TGGCTTTGTG TTAAAGCTAA 1080

TTTACyTCTT CTGGAGCCAA CGACTrATTT ACGCATGTTC AATTCCTCCT AGGAACGGAT 1140

TGCTCTAGAT TATTTTCTCA ATTCACAAAA TGTGTTGCAA AAAATAAATT AATCATATTT 1200

AAGCAAAATA AAATAATGTT ATAGTATATT AAATATCTTG AATTCAACCA TTTGTTGATT 1260

CTAAGTAAAA TATAACTTCC ATATAATACT GTAATAATTG AAGAGAGTAT TACCTTCGGG 1320

TCAATGAATA TACGTTCACC AACTGAAATT ACACCCCACT GTGTACCTAA AATAATACTA 1380

AATATC3AGAA TTATCCACCC ACTTAACGTT GAGTAAAACA CAATT6ATTC AAGTGTAGCA 1440

ACC5CTACCAA TTCTAAAGTA TTTTTGATCA AAACGTTTTT CCTTCAAATT ACGGTATTGC 1500

ATGATATACA GTAATGCATT GACAAAAGCT AAGGCAAAGA AGACATAACT TAACACAGCT 1560

AGACCGATAT GGACTAACAG TAACTCGTCT ACAACAGCAA TTTTCTGAAC CTTATTAGTA 1620

TAATGTGTCG GTTGAAATGT ATTCATCCCT AAnAGTGTTA ACCCTATTAA ATTCCAAGGA 1680

AAAACACAG 1689



(2) INFORMATION FOR SEQ ID NO: 180:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1209 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:
nTGGnTGGCT TTTCCTATTG GACCAAATGG ACCnTTTACC TGGCCnTTCC CAGGACACCC 60
CGCTTGTGCC CACATTCCAA TCGGAAAAGO TGTATGTGGT ACAGCCGTTT CAGAACGTCG 120

TACACAAATT GTAGCTGATG TTCATCAATT CGAAOGACAT ATCGCTTGTG ATGCTAATAG 180 
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CGATGCCCCT ATAACGGATC GATTTGATGA CAATGACAAa GAaCATCTTG AaGCAATTGT 300 

TAAAATTATT GAAAaGCAAC TCGCATAAAA GGACATCAGC ATTTTCAATA AAGTGTTGAC 360 


AGTTAGCAGG AAAATGTTAC AATAATCTTT GTGTGAATTA ACGAAAGTAG CAGTTGTATA 420 

TTATTGAGCG CTATGTTGTT CCCAATGCGG ACGTGTCACG TAACTGTCGC TATAAGGTGA 480 

AGACACATAA AACAATATAT CTTAGTAAGC ATGCAACACT CTTTTTTCTT TATTCATAAC 540 


AACAAAAAAG AATTAAAGGA GGAGTCTTAT TATGGCTCX3A TTCAGAGGTT CAAACTGGAA 600 

AAAATCTCGT CGTTTAGGTA TCTCTTTAAG CGGTACTGGT AAAGAATTAG AAAAAC6TCC 660 

TTACGCACCA GGACAACATG GTCCAAACCA ACGTAAAAAA TTATCAGAAT ATGGTTTACA 720 

ATTAC?GTGAA AAACAAAAAT TACGTTACTT ATATGGAATG ACTGAAAGAC AATTCCGTAA 780 

CACATTTGAC ATOGCTGGTA AAAAATTCGG TGTACACGGT GAAAACTTCA TGATCTTATT 840 

AGCAAGTCGT TTAGACGCTG TTGTTTATTC ATTAGGTTTA GCTCGTACTC GTCGTCAAGC 900

ACGTCAATTA GTTAACCACX; GTCATATCTT AGTAGATGGT AAACGTGTTG ATATTCCATC 960 

TTATTCTGTT AAACCTGGTC AAACAATTTC AGTTCGTGAA AAATCTCAAA AATTAAACAT 1020 

CATCGTTGAA TCAGTTGAAA TCAACAATTT CGTACCTGAG TACTTAAACT TTGATGCTGA 1080

CAGCTTAACT GGTACTTTCG TACGTTTACC AGAACGTAGC GAATTACCTG CTGAAATTAA 1140 

CGAACAATTA ATCCGTTGAG TACTACTCAA GATAATACGG TCAATACCAA CACCCACAAT 1200 

TGTGGGTGT 1209

(2) INFORMATION FOR SEQ ID NO: 181: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 698 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181: 
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SO 



AAATCCCTTt GTtaAAgTsC AAAtTTTTCc AACrgCTTTA AtArGACCCA TATTACCtTC 60

TTGGATTAAA tCmAGGaATG AcATACCACG ACCaCGTATC TTTTAGCAAT ACTTACAACT 120

AAACGTAAGT TC6CTTCTGC AAGTCTTGAT TTTGCTACTT CATCACCTTG TTCAATACGT 180

TTGGCTAATT CGATTTCTTC TTGTGCACTT AATAAGTTAA CACGCCCAAT TTCTTTAAGG 240

TACATACGAA CTGGGTCATT TATTTTAACA CCTGGAGGGG CACTAAGATC ACTTGGATTC 300

AOTTTCTCGT CAGTATCTGA ACTATCTTTT TCATTAACTA GTGAAATATC ATTATCATTT 360
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GCAATTTCTT CATGACTTAA ATGACCCTCT TTTTTACCTT TTTCAATTAA TTGCTTCTTA 480 
ACATCTTCTA ATGTTAATGT CGGATCAATT GTTTGTTTTT TAATTTTAAC TGTGTTATCA 540 


GACATGAAAC GGCCTCCCGA TTTTAAATAT GAACATTCGA AATTTATTCA ATATTGCTAT 600 
TTTAAACGAA ATTCTTAATT AATTCCATCC ATATTTTnAA TTTTATTTTA CAAATTGGGA 660 
ACTAAATCCC CAATATTTAT TTTTCAATAG TGGTGGTT 698 


(2) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5147 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:

ACTTGATGAT GTATACAATG TATTTCAAGA ATATTATCAA AAAACATCTA ACATTAAGTT 60 

TTGTAGAATT CACAATTCTA GCTATTATCA CTTCTCAAAA TAAAAACATC GTTCTTCTTA 120 

AA6ATTTAAT TGAAACAATC CACCATAAAT ACCCTCAAAC TGTTAGAGCT CTCAATAATT 180

TAAAAAAGCA AGGCTATCTA ATAAAAGAAC GCTCAACTGA AGATGAAAGA AAAATTTTAA 240 

TTCATATGGA TGACGCGCAG CAAGACCATG CTGAACAATT ATTAGCTCAA GTGAATCAAT 300 

TATTAGCAGA TAAAGATCAT TTACATCTTG TTTTTGAATA ATATCTCTAT TACGCAAGTG 360

TGCTGTATTC TAAAGTGCAC TTGTGTTTTC TATTTTTTAA TAAAACCTCA GCACATAATG 420 

AACAACTTTC TATTTTCTAT ATCACTTAAA ACCATTTCCG AAATTAAACC TCAGCACATT 480 


CAAAGCCCCA CTTTATTCTT AAAAATATTT TTTAACTCAT ATGTATTAAA CCGCTTTCAT 540 

TATAAAAAAT ATCTCTATAT TtTATCTGtT TtTATTAATC GAAATAGCGT GATTTTGCGG 600 

TTTTAAGCCT TTTACTTCCT GAATAAATCT TTCAGCAAAA TATTTATTTT ATAAGTTGTA 660 


AAACTTACCT TTAAATTTAA TTATAAATAT AGATTTTAGT ATTGCAATAC ATAATTCGTT 720 

ATATTATGAT GACTTTACAA ATACATACAG GGGGTATTAA TkTGAAAAAG AAAAACATtT 780

ATTCAATTCG TAAACTAGGT GTAGGTATtG CATCTGTAAC TTTAGGTACA TTACTTATAT 840 


CTGGTGGCGT AACACCTGCT GCAAAtgctG CGCAACACGA TGAAGCTCAA CAAAATGCTT 900 

TTTATCAAGT CTTAAATAT6 CCTAACTTAA ATGCTGATCA ACGCAATGGT TTTATCCAAA 960 

GCCTTAAAGA TGATCCAAGC CAAAGTGCTA ACGTTTTAGG TGAAGCTCAA AAACTTAATG 1020 

ACTCTCAAGC TCCAAAAGCT GATGCGCAAC AAAATAACTT CAACAAAGAT CAACAAAGCG 1080
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AAAGTCTTAA AGACGACCCA AGCCAAAGCA 
CTAACGTTTT AGGTGAAGCT AAAAAATTAA 1200

ACGAATCTCA AGCACCGAAA GCTGATAACA ATTTCAACAA AGAACAACAA AATGCTTTCT 1260

ATGAAATCTT GAATATGCCT AACTTAAACG AAGAACAACG CAATGGTTTC ATCCAAAGCT 1320

TAAAAGATGA CCCAAGCCAA AGTGCTAACC TATTGTCAGA AGCTAAAAAG TTAAATGAAT 1380

CTCAAGCACC GAAAGCGGAT AACAAATTCA ACAAAGAACA ACAAAATGCT TTCTATGAAA 1440

TCTTACATTT ACCTAACTTA AACGAAGAAC AACGCAATGG TTTCATCCAA AGCCTAAAAG 1500

ATGACCCAAG CCAAAGCGCT AACCTTTTAG CAGAAGCTAA AAAGCTAAAT GATGCTCAAG 1560

CACCAAAAGC TGACAACAAA TTCAACAAAG AACAACAAAA TGCTTTCTAT GAAATTTTAC 1620

ATTTACCTAA CTTAACTGAA GAACAACGTA ACGGCTTCAT CCAAAGCCTT AAAGACGATC 1680

CTTCAGTGAG CAAAGAAATT TTAGCAGAAG CTAAAAAGCT AAACGATGCT CAAGCACCAA 1740

AAGAGGAAGA CAATAACAAG CCTGGCAAAG AAGACAATAA CAAGCCTGGC AAAGAAGACA 1800

ACAACAAGCC TGGTAAAGAA GACAACAACA AGCCTGGTAA AGAAGACAAC AACAAGCCTG 1860

GCAAAGAAGA CGGCAACAAG CCTGGTAAAG AAGACAACAA AAAACCTGGT AAAGAAGATG 1920

GCAACAAGCC TGGTAAAGAA GACAACAAAA AACCTGGTAA AGAAGACGGC AACAAGCCTG 1980

GCAAAGAAGA TGGCAACAAA CCTGGTAAAG AAGATGGTAA CGGAGTACAT GTCGTTAAAC 2040

CTGGTGATAC AGTAAATGAC ATTGCAAAAG CAAACGGCAC TACTGCTGAC AAAATTGCTG 2100

CAGATAACAA ATTAGCTGAT AAAAACATGA TCAAACCTGG TCAAGAACTT GTTGTTGATA 2160

AGAAGCAACC AGCAAACCAT GCAGATGCTA ACAAAGCTCA AGCATTACCA GAAACTGGTG 2220

AAGAAAATCC ATTCATCGGT ACAACTGTAT TTGGTGGATT ATCATTAGCC TTAGGTGCAG 2280

CGTTATTAGC TGGACGTCGT CGCGAACTAT AAAAACAAAC AATACACAAC GATAGATATC 2340

ATTTTATCCA AACCAATTTT AACTTATATA CGTTGATTAA CACATTCTTA TTTGAAATGA 2400

TAAGAATCAT CTAAATGCAC GAGCAACATC TTTTGTTGCT CAGTGCATTT TTTATTTTAC 2460

TTACTTTTCT AAACAACTTC TGAAACGCCT CAACACTTTC TACTCTGATT ACATATATGA 2520

CATTTTTAGG CATTAAAAAA TCGAACTAGA CAAGATGCTC ATTGCATTTC GTACTAGTTC 2580

GATTCATGAA TAATTAGATT TAAAATGTCA TTTGAATCCA AGTGACAACA TTATTTATAT 2640

TTAGAATATT AACGTTAGTA TAAACGTCCA AACACAAATA AAAGCAACAA ATATAATACT 2700

GTATTTTAAC GTCATTTTTA ATAATGCAGA TTCTTCACCA ACTTTTTTAA CAGCTGCAGT 2760

CGCAATGGCA ATTGATTGTG GTGAAATAAG TTTCGCTGCT ACACCACCTG CAGTGTTAGC 2820

TGCCACAAGT AATGAACCGC TTGTTGAAAT TTGTTGTGCC ACTGTCGCTT GAATAGGTGC 2880
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TGGAGAGAAT AATGGGAAAA TTGCTCCCGC 
CAAACCACCX5 TATGTCATAA CTTTAGCAAT 
^ TAACCATAAT TCTTTAATTG CTTCGACCAA 
CGTAATTAAA ATTGTAATAA TTACTGTTAA 
ATCGAGACGC AACGCAATTC CTTTAGGCGA 

10 

TTTTATTACT AAACTTTCAA GTGCACCTCC 
ACTCCATACT AATACAAAGG CAGTTAAAAT 

j5 TTTAGGCGTT CGTTTTTGAA TTTTATGTTC 
TTTAAATTTA CGACAAACAA AT6CTAACAC 
TQCTAGTTCT GGACCATGGA ATATTGTTAA 

20 CACTGITAAA ATGACAGGTA AAATTTCTTT 
TAAAACAAAT GGAATAATAA AGTTTAAAAT 
ATCTAATGTT GTAACGCCTC CACTTAAGTT 

25 

AATTGCACCA AAGGCACCCG CCGCACCATT 
TGGTTCAAAT CCAAGTTGAA TTAATAATAC 
TGCTGCACCT TCTAAAAATG CGTTGAAACA 

30 

GTCCACTGAA ATACTTGCAA TACTATCTTG 
AACTTTATAT AACCAAACTG CCATTAAAAC 

35 AACGCCTTCT GTAATCGCAC CTGCTGATAC 
CACAATCAAT GTAACAACCA AAGTTGTCAA 
ggtt£agcat AATAAAAATA AAATAATAGG 

^ ATTATCGAAT GGATTTACAG TAAGTAGTGT 

TTTATCATTC tgattaatct acaacctatt 

TAAAATGTAA CACTCCTATA T6TGACAGGC 

45 

ATACACAATA TTTAACTATA ATAmATAATA 
TTTATAATAA TCTCAGGAAT TCGCTTCAAA 
so AGAATCTCTC ATTTTATGAA TTGTAGGAAG 
GATAAT6ATA AATATCATAT TAAACCATAG 
CCAAATTTCT AATACTGTGA AGATAGACAT 
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TTTAGCAATA CCTTGTCCAA TTGCTACAGT 3000 

AGCTAGGATA GCTGAAATTG TAAGGATCGG 3060 

TAAAGCACCT GCACTTTTCC ATTTTAACTT 3120 

TAAAATCGCT GTCCCAGTTG CACCAATTAA 3180 

TAAATCACTC ACAGTATTTG GAATTGGCAA 3240 

AGGTTGGAAT AATTTTTTGA AGAATGGTGC 3300 

TACGAACGGA CTCCAAGCAA AGACAATTTC 3360 

AGACX5CTTCC AATCTGAAAA TGTTTTTCX3G 3420 

CACCATTGTT GCTAGTGAT6 GAATAATGTC 3480 

TAATAATTGT AATCCAGTAT ATGTACCACT 3540 

AATACCTTTC ATACCATCTA CAATGAATAC 3600 

TGG7VAGTGTT AATGCTGAGT ATCTCGCAAC 3660 

AAACGTATCA ATAATACTAA CTGGTAAACC 3720 

AGCAATTAAA CATAACATCG CTGCTTTTAA 3780 

TGCACAAATC GCAATTGGCA CACCAAATCC 3840 

AAATCCAATT AATAATAGTT GGATTCTTTG 3900 

AATAATAGAA AATTGTCCTG TTTTAATAGA 3960 

GATATATCCT ATTGGGAAAA TACCGGCAAC 4020 

ACGCGCTGGT AATTCAAATA CAAATAAAGC 4080 

TGCTGCATAA ATGCCTTTCA TTTTAAAAAC 4140 

TACTGCTGCA ACTAAGGCTG ATAATCCX3AC 4200 

CATAATGACT CCCTCTCTTT ATATAAAATA 4260 

TCAACTTATA TTTTGCGATG ATCACATATT 4320 

AATCGAATTT TTACAAAAAG TTCACAAAAT 4380 

TATCaTtXtTA ATTATAAATA CTAGATATTA 4440 

ACTGCATCAT GAGAGTTTAT ATTTTTATTG 4500 

TAAACAAAAT ATGACAAGCG TCAAACCAAT 4560 

TAAATTGAAT TGATGATGGT GTTGTATTTG 4620 

ATAGCTCATA ATCTCTAAAT TTAAOGTACT 4680 
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AAATCGTTCA TAGTATCTAC CTGCAATGAA AAATATAAGC CAAATCACTA TAAATGCGCT 4800 

ATTAATCAAA AGCAGCACCC ATTTATCAGC AAAATTATCA GCATCCCCTG CTAAATTATA 4860 

ATGAATAGGC ACTTTGGTTG GTAATTTTGG ATAGGTCACT ACTGTATAGC ACATCATAGC 4920 

TAAGTAAATA AGTAGACTTA ATATTGTAAA AGACCTGATT TTAGACATTC TATCGCCTcT 4980 

TcTTTACATT TTATGTATAA CACTCTGCCT ATTTTACCTT TTAATaCATT ACCCCAAcGA 5040 

TtAAaCAATA tGTAaTGATA CTATAATTGC GTCAGGAGTA TCCX3CTTGTT AAATGTGCAT 5100 

AGCTTATATT TAGCTGTTTA ACATGCCACA TAATGATTCG AATTATT 5147 
(2) INFORMATION FOR SEQ ID NO: 183: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1312 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:

CACTTACTTC CACCATTATC ATAACTTTAA AATGGATATA nTTCATCAAA CATTATCTAA 60 

AGGCGTCGCA CCTACACCAA CACCATCCAA CAATTAACTT ACAACTCTGC GATTACTTCT 120 

TCAGCAGCAA CTTTCACnTG CGTAATACAA TCAGGTAGTC CAACCGCTTC AAAAGATGCA 180 

CCAGTTACTC TAAGTCGTGG ATATGTTTGT TTAATATGTG CTTGAATCTG TCTAATTTGT 240 

TGAATATGAC CGACATGGTA CTGTGGCATA CTTTTCGGCA AACGATTGAC AATTGTAAAT 300 

TCAGGATCAC CTTTAAATGT CATCATTTGA CTTAAATCTC TACGTACAAT CGATACTAAT 360

TCATTATCTG TATGATCATC AACCACAGTA TCACCTGGTT TACCTACATA CGCACGAATC 420 

AAAffiCTTAC CTTCCGGTGT AGTAAATGGC CATTTTTTCG ATGTCCAAGT ACATGCGGTA 480 

ATGfCTGTAT CACTCGTTCT CGCAATTACG AAGCCAGTAC CATCATGGGT ATTTTCAATG 540

TCTTTTTCAT CAAATGCCAA TACAACAGTT GCAACAGTCG TACTATCCAT CGTTTTAAAG 600 

TAATCAAATG CTGGATCTTG TCCGAACCAA TTTAAAAACA CTTGATGTGG TGTCGTTACT 660 

AATACGCCAT CATACACTTC TTCTAGTTGA TCATTGTAAA CAATTTTATA TTGTTTTTGA 720 

GATGTAATTA TATCATCCAC TGACGTATTG TAGCGTATTG TCACACCTTT ATTTTTAACA 780 

TCTTGTTCTA ATGCTTCAAT AAATQAGCTT AAACCATGCT TAAATTGTTT GAATTGTCCT 840 

TTCGGTGCGC CAGGATATAA TTGTCTTTGT TTCAGACGCT TATTTTTCTC ATCCTTCATA 900 

CCTTTTATCA GACTTCCGAA TGCCTCTTCT TTTTCTTTAA AATTAGGAAA CGTACTCATC 960 
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TCAAGTACCT CATTACCTAA TCTTGCTCTG AAAAATGCAC CAACAGAAAT GTCACCATCC 1080 

TGCATTTGAG TAGGTTTTTT TAATAAATCA AACCCTGCTC TTAATTTACC AAGTGGCGAT 1140 

ATTAATTTTG TAGTAACAAA TGGTTTAATA TCTGTTCGAA TACCCATAAT TGAACCACCT 1200 

GGAATCGGAT ATAATTTATT TTTCGCAAAA ATATATGATT GTCCAGTCGT ATTTGTAACA 1260

ATATCTTGTT CTAATCCAAT ATCTTTCGCT AATTCTGTCA TAATCGTTTT TC 1312 
(2) INFORMATION FOR SEQ ID NO: 184: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6157 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184: 

TTTTACAATA AAAATATGAT ATACTACTTG TCGTATATAA GGAACGGAGG ACAATTTATG 60 

CATACATTTT TAATCGTATT ATTAATCATT GATTGTATTG CATTAATAAC TGTTGTACTA 120 

CTCCAAGAAG GTAAAAGCAG TGGACTTTCA GGTGCCATCA GTGGTGGTGC TGAGCAGTTA 180 

TTCGGTAAAC AAAAACAACG TGGCGTCGAT TTATTCTTAA ATAGATTAAC AATTATTTTA 240 

TCAATATTAT TTTTTGTACT TATGATTTGC ATAAGTTATC TTGGTATGTA AGGTCCGGCG 300 

ATGTAAATGT CGGGCTTTTT TATTTATAAT TAAGAATGTA ATAGTTTAAC AATAAGCTAT 360 

GTAAAATATA TAGCCTAGTT AAGTATGCAA AGGGAGCGTT AGATTTATGC AGATAAAATT 420 

ACCAAAACCT TTCTTTTTTG AGGAAGGTAA ACGTGCCGTG TTATTACTAC ATGGTTTTAC 480 

AGGCAATTCG TCTGATGTTC GTCAATTAGG TCGATTTTTA CAAAAGAAAG GTTATACATC 540 

ATATCCACCG CAATATGAAG GCCACGCGGC ACCACCAQAT GAAATACTGA AATCTAGTCC 600 

TTTCGTTTGG TTTAAAGATG CGTTAGATGG TTATGATTAT CTTGTTGAAC AAGGTTATGA 660 

TGAAATTGTT GTTGCTGGTC TATCATTAGG TGGGGATTTT GCTTTAAAAT TAAGCTTAAA 720 

TAGAGATGTA AAGGGTATTG TAACGATGTG TGCTCCTATG GGTGGCAAAA CTGAAGGTGC 780 

CATTTATGAA GGCTTTTTAG AATATGCACG CAATTTTAAA AAGTATGAAG GTAAAGATCA 840 

AGAGACTATT GATAATGAAA TGGATCATTT TAAACCAACT GAAACTTTAA AAGAACTAAG 900 

TGAAGCATTA GATACGATTA AAGAGCAAGT TGATGAAGTG TTGGATCCTA TTTTAGTGAT 960 

TCAAGCAGAA AACGACAATA TGATTGATCC ACAATCCGCA AATTATATAT ATGACCATGT 1020 

AGATTCTGAT GACAAAAATA TCAAGTGGTA CAGTGAATCT GGACATGTTA TTACGATTGA 1080 






AGAATAAAAA GAGATTTTAA CATTAGAAAG 
AGAAGAGATT ATTAATCAAC CTGAATATGA 
^ ATTAGGTTTA AGCAGTGCCG ACTCGTTTAG 

ACAATCAGGA TTAATCGAAC GTACAAAAAC 
AGGTCAATCA AAATTGATAA AAGGAACX3TT 

10 

AAGACCTGAA GATGAGGATA TGGAAGATAT 
CTTGGATGGA GATACTGTTA TTGTAGAAAT 
AATCX3AAGGG 6AAGTTAAGT CGATTGAGAA 
TAGTGAAGCT AGACATTTTG 6CTTTGTTAT 
TTTCATTCCT AAAQGTCAAA GTTTAGGCGC 
20 TACTAAGTAT GCTGATGGTT CAGATAATCC 

TAAAAATGAT CCTGGCGTAG ATATTTTATC 
ATTTCCTGAT GAAGTGTTAC AAGAAGCTGA 
AATTAAAGGC CGTCATGATT TACGTGATGA 
TAAAGACTTA GATGACGCAA TTAGTGTTAA 
TGTAAGTATT GCTGATGTCA GCTATTATGT 

30 

ATATGATAGA GCGACAAGTG TATATCTTGT 
ATTAAGTAAT GGTATTTGTT CATTGAATCC 

35 CATGGAAATC GATGCTAGTG GTCGCGTTGT 

TTCTGATTAT CGAATGACGT ATGATGCGGT 
CATTCGCGAA CAATATAATG AAATTACGCC 

4^ TCGTTTGATT CAAATGAGAA AACGACGTGG 

AGTATTAGTT AACGAAGACG GTATACCAAC 
TGAACGTCTA ATTGAATCAT TTATGTTAAT 

45 

TAAGTTAGAT GTACCTTTTA TTTACCGAGT 
ACAATTCTTT GATTTTATTA CAAACTTTGG 
5^ TCATCCAACA ACACTTCAAA AGGTTCAAGA 

CATTTCAACA ATGATGTTGC GTTCAATGCA 
ACATTTTGGC TTATCAGCTG AATATTATAC 

55 
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GAGGGGCATA ATGAATTTAA AGCAATCTAT 1200 

ACCTATGTCA GTGTCAGATT TTCAAGATGC 1260 

AGATTTAATT AAGGTGCTTG TGGAGTTAGA 1320 

AGACAGATAC CAAAAAAAGC ATAGTTATAG 13 80 

AAGTCAAAAT AAAAAAGGCT TTGCATTCTT 144 0 

ATTTATTCCC CCGACGAAAA TTAATCGTGC 1500 

CCATCAATCA AAAGGTGAAC ATAAAGGTAA 1560 

GCATTCTGTA ACTCAAGTTG TTGGTACGTA 1620 

TCOGGATGAT AAACGTATTA TGGAAGATAT 1680 

AGTCGATGGT CATAAGGTAC TTGTAGAAAT 1740 

AGAAGGACAT ATTTCTGCTA TTTTAGGACA 1800 

TATTATCTAT CAACATGGCA TAGAAATTGA 1860 

AGCAGTACCT GATCATATTG AAAATACTGA 1920 

ATTGACAATC ACAATTGATG GTGCTGATGC 1980 

AAAGTTAGCG AACGGTAATA CGCAATTAAC 2040 

AACAGAAGGT TCTGCATTGG ATAAAGAGGC 2100 

TGACCGTGTA ATTCCAATGA TTCCACATCG 2160 

TAATGTTGAT CGTTTAACTC TAAGCTGTCG 2220 

TAAACATGAA ATTTTTGATA GTGTTATACA 2280 

AAATCAGATT ATTACTGAAA AGGATCCTAA 2340 

TATGCTAGAT TTAGCACAAG ATTTATCTAA 2400 

TGAAATCGAT TTTGATATTA GTGAAGCAAA 2460 

AGATGTTCAA TTAAGACAAC GTGGCGAGGG 2520 

TGCAAATGAA ACAGTTGCTG AACATTTTAG 2580 

GCATGAGCAA CCTAAATCAG ATCGCTTAAG 2640 

CATCATGATT AAGGGTACTG GCGAAGATAT 2700 

AGAAGTAGAA GGTCGACCTG AACAAATGGT 2760 

ACAAGCGCAT TATGATGATG TGAACTTGGG 2820 

GCATTTTACA TCACCAATTA GACGTTATCC 2880 
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AGAAGTQAAG CGTTGGGAAG ACAAATTGCC TGAGTTAGCT GAACATACTT CTAAACGTGA 3000 

ACGTCGTGCT ATTGAGGCAG AACGTGATAC TGATGaATTG AAAAAAGCAG AATATATGAT 3060 

TCAACATATT GGTGATGAAT TTGAAGGTAT TGTCAGCTCA GTAGCTAACT TCGGTATGTT 3120 

CATTGAATTG CCAAATACGA TAGAAGGTAT GGTTCATATT GCGAATATGA CTGATGATTA 3180 

TTACCGTTTT GAAGAGCGTC AAATGGCATT AATTGGTGAG CGTCAAGCTA AAGTATTTAG 3240 

AATTGGTGAC ACAGTTAAGG TTAAAGTGAC GCATGTTGAT GTAGATGAAC GATTAATTGA 3300 

TTTTCAAATT GTAGGTATGC CTTTACCGAA AAATGATCGA TCACAGCGCC CAGCGCGAGG 3360 

TAAGACAATT CAAGCCAAAA CGCGTGGTAA ATCATTAGAT AAATCAAAAT CTGATGATAA 3420 

GGGTCGTAAG AAAAAAGGTA AGCAACXSTAA AGGTAAAAAC CAACX3TAATA ATGATAAATC 3480 

AGGTAATAGT AAGCATAAGC CATTTTATAA AGATAAAAGT GTGAAAAAGA AAGCACGTCO 3540 

TAAGAAAAAA TAAGCAGCAA TGAGGTGAGT ATGAATGGCT AAGAAGAAAT CACCAGGTAC 3600

ATTAGCGGAA AATCGTAAGG CAAGACATGA TTATAATATT GAAGATACGA TTGAAGCGGG 3660 

AATTGTATTG CAAGGCACAG AAATAAAATC AATTCGCCGA GGTAGTGCTA ACCTTAAAGA 3720 

TAGTTATGCG CAAGTTAAAA ACGGTGAAAT GTATTTGAAT AATATGCATA TAGCACCATA 3780 

CGAAGAAGGG AATCGTTTTA ATCACGATCC TCTTCGTTCT CGAAAATTAT TATTGCACAA 3840 

GCGTGAAATC ATTAAATTGG GTGATCAAAC ACGTGAGATT GGTTATTCGA TTGTGCCGTT 3900 

AAAGCTTTAT TTGAAGCATG GACATTGTAA AGTATTACTT GGTGTtGCAC GAGGTAAGAA 3960 

AAAATATGAT AAACGTCAAG CTTTGAAAGA AAAAGCAGTC AAACGAGATG TTGCGCGCGA 4020 

TATGAAAGCC CGTTATTAAG CGATTTAGTT GCTTAATCGG GCTATATTTG ATATAGTTAT 4080 

ATGTGCTTTT GTAAATTACA AAAGTATGAT TTGTTTGATT TATTATTTCG GGGACGTTCA 4140 

TGGATTCGAC AGGGGTCCCC CGAGCTCATT AAGCGTGTCG GAGGGTTGTC TTCGTCATCA 4200 

ACACACACAG TTTATAATAA CTGGCAAATC AAACAATAAT TTOGCAGTAG CTGCCTAATC 4260

GCACTCTGCA TCGCCTAACA GCATTTCCTA TGTGCTGTTA ACGCX3ATTCA ACCTTAATAG 4320 

GATATGCTAA ACACTGCCGT TTGAAGTCTG TTTAGAAGAA ACTTAATCAA ACTAGCATCA 4380 

TGTTGGTTGT TTATCACTTT TCATGATGCG AAACCTATCG ATAAACTACA CACGTAGAAA 4440 

GATGTGTATC AGGACCTTTG GACGCGGGTT CAAATCCCGC CGTCTCCATA TTTGTAGCCT 4500 

ACAGCCTTTG TGGTTGTGGG CTTTTTTATT TTGTGTTTTT CAGGGGATAA TGCATTGCAG 4560 

AATTTGTTGT GAGTATTGAT ATAGCAGTGT TTGTATAGGT GTTTATTTGA TGGAGGAAAG 4620 

AGTAATAAGT GATTATGAAT TAGTTTTTGA GATATAAGGG GACAGTGATG TGTGTCAAAT 4680 
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TTATACGCAA AAAATTCTCC ATGTTATATA TGTCAATATA AAAATGTGAA TCGTCTACAC 4800 

TTAATTGGAT AAATGGCTAC TGAAAAAGAA CTTTTCATTT TTGTTACGTC ACTAAGTGGG 4860 

TGTAGTTATA AAGAGATGAG CCGAGTTTTG ATATTTTCAT TAGAATCAAT ATGCCTATTA 4920

ACACAATCAG CAATAGTTGA CGAGACGGAA ATAAAAGAAG TCGTAGTTAA GAAATGCATT 4980

TCACAACATA CCATTGTAGC CATTTTTATT GTTTTGGATG ATAAACTCTT TTTGGAATTT 5040 


TTAGTTTTTA TAATTTGCAA CTACACTACT TCTTTTACTA ATATTAATGT CTAAGTAATC 5100 

GATAAAAAAT TTTCCATTGA ATAAATGAGA AGTTAAAAAC TTTACTTAAC CTTTCycATT 5160 

GCATTTTCCT ATTCACGATT TTAAGAACCC AACATACTAC AAACGAATTT TAAAAGGCGA 5220 

GAGTAAAGCT TACTTGTTTA TTATACATAT TTAAAATCCA AGAGTCAGAA CAGACTACTC 5280 

CTCTTTATAA CTATAAAAAA TAGCTATGAA AAAATCTATC GTCATAGATT CCTTCATAGC 5340 

TAATCTTAGT ATGTTTATTT TTATTTTAGG ATGCTATTTA TC7VACTCAAC ATATAACTCA 5400

CTATTTTTAT AACCTTCTAA TATATCATTA ACTTGTCTAA TAGGTATTTC TGGTACTTCT 5460 

CTAATGTTTT CCAATTTTGT TTTAAATTGT TTTTTTGTTA TTTGCTCTTT ATTTGTAGCC 5520 

AATTGGAACA AGTAAGAATC TAGCATATTA ATTTCTTTAT ATGAATACAT ATATCTTAAT 5580 

AACACTAAAT CTCTAGTTTT TAAGTTAGGC GCTAGTTCTT CTTGTAATTG TTCTATTGAT 5640 

TGTyTCATTA ATAACAATCT CATTTCTAAT TCTTCATTAT TCATTTTATC ACACTCTTtT 5700 

TATATTAATG CTTGACCAAC TTGGGAAACC CAAAACXTCTA TGCTTCTTGC AGTAGAATCT 5760 

TTAATACCAG TTCCCATCAA TGCTTGTGAA ACTTGACCTT GTACATTTCC CCATGTAGCC 5820 

TCTTCTTGTT TTAATGCATT ATTCAATGCG GGATTTACAA ATTTATCCCA TCTTTTTTTT 5880

ATGATTTTCC GGCACGGGGA CTGATTTCTT TAACACCATT AAACACAGAT TTTTTATTTT 5940 

TAATSyVTAGC TTTATAGTAT CATGTTGGCT AAGCTATAAA TAAGTCAGTT TCTCTAAAAA 6000 

TTAAATAACT GAATGTAAGA CAATCAACAA wCCAAATTTA TACTTCATCT AAACCACTGT 6060 

GGTCGTCATC TTTTTGCTTT TCTTTTTCTT TCTCTCXjTTC TTGTTCTTTT TTGTACTCTT 6120 

CTTCAAATTC TTTTTCTTTC TTTTCTACTT CTTCTCT 6157 

(2) INFORMATION FOR SEQ ID NO: 185: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 884 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 
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CATTTGTTAT TCTGAGTAGC CAATTTGGCA AAC3ATGAACA 
AACGTCTGAA CAAACGTATC 60

AAGTTGCAGT CGCATTAGAG TTAATTCATA TGGCAACACT TGTTCATGAT GACGTTATTG 120

ATAAAAGCGA CAAGCGTCGA GGCAAGTTAA CCATATCAAA GAAATGGGAT CAGACAACTG 180

CTATTTTAAC TGGGAATTTT TTATTGGCAT TAGGACTTGA ACACTTAATG GCCGTTAAAG 240

ATAATCGTGT ACATCT^TTG ATATCTGAAT CTATCGTTGA TGTTTGTAGA GGGGAACTTT 300

TCCAATTTCA AGACCAATTT AACAGTCAAC AGACAATTAT TAATTATTTA CGACGTATCA 360

ATCGCAAAAC AGCACTGTTA ATTCAAATAT CAACTGAAGT TGGTGCAATT ACTTCTCAAT 420

CT6ATAAAGA GACTGTACX5A AAATTGAAAA TGATTGGTCA TTATATAGGT ATGAGCTTCC 480

AAATCATTGA TGATGTATTA GACTTCACAA GTACCGAAAA GAAATTAGGT AAGCCGGTCG 540

GAAGTGATTT GCTTAATGGT CATATTACGT TACCGATtTT ATTAGAAATG CGTAAAAATC 600

CAGACTTCAA ATTGAAAATC GAACAGTTAC GTCGTGATAG TGAACGCAAA GAATTTGAAG 660

AATGTATCCA AATCATTAGA AAATCTGACA GCATCGATGA GGCTAAGGCA GTAAGTTCGA 720

AGTATTTAAG TAAAGCyTTG AATTTGATTT CyGaGTTACC aGATGGACaT CCGaGAtCAC 780

TACyTTTAAG TTTGACGAAA AAAATGGGTT CAAnAAACAC GTAGTATTTA TGnAAAAGTA 840

TTGAAAGCGC TTTACCAACC TGTTAATATA TAATAGTAAT ATAC 884

(2) INFORMATION FOR SEQ ID NO: 186:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6876 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:

AATTTCATCT GCTCGTGCAA AATCTTTGTT TTTCCTTGCT TCATTACGCT CTTCGATTAA 60

TTTTTCAACA TCTTCATCCA ATAATTCATC TGCATTTTTA GATTTTAACG GTACACCTAA 120

AACATCGCTG AAAATTTGAT AAACTGCTTT AAATTTATCA ATTACTTCTG TTGATGTTGT 180

GTTCTCTAGT ACATATTTAT TCGCAAGTkT TGCTAAATCA TACCAAGCTG TAATTGCATT 240

AGCTGTATTA AAATCATCAT TCATAACTGT TTCAAAACGA TTTAAAATCG CATCAATTTG 300

ATCAATATAT GTCTGTTGAT TTTCAATATT AGTAGCAATT TGTGCGCGCT CTTCAATTAA 360

TTGATAACTA TTGCGAATAC GCTCTAGTcC aCTACGTGCT GATTCTACCA ATTCTAGATT 420

ATAGTTAATT GGGCTTCTAT AATGTACGCT AATCATAAAG AATCTTAGTA CATCTGGATC 480
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ATTATCAATA TTAATGAAAC CATTATGCAT 
CCAATAAtTA GCAAATGGCG CATGATTATG 600

TGCTTCTGAT TGTGCTATTT CATTTTCATG ATGTGGAAAT TGTAAATCTG AACCACCCGC 660

ATGTATATCA ATTGTAGGTC CTAGCTCATG AAATGCCATT ACAGAACATT CTATATGCCA 720

TCCTGGTCTA CCTTCACCAA ATGGGCTATC CCAACTAATC TCGCCAGGTt CGCTTTTTTC 780

CACAATGTAA AATCAAGTGC ATCTTCTTTA TGCTCTCCTG CATCTATACG AGCACCCACT 840

TTTAAGTCAT CTATGGATTG ATGACTTAAT TTACCATAAC CTTCAAATTT ACGTGTTCTA 900

AAGTAAACAT CGCCACCACT TTCATATGCA TAACCTTGAT CCACCAAATC TTTAATAAAT 960

TGAATAAT6T CATCCATATG GTCCATTACC CTTG6ATTTG AAGTCGCTTT TCTAACATTT 1020

AACGCACCAA CATCTTCATG AAAAGCAGOG ATATATTTTT CTGCAATTTC GGGAACAGAC 1080

TGATTTAATT CTTGAGAACG TTTAATTAAT TTATCATCTA CGTCTGTAAA ATTTGATACA 1140

TATTCTACAT TATATCCTTG GTATTCAAAG TAACGTCTCA CTACGTCATA ATTAATTGCw 1200

GGTCTTGCGT TACCAATATG AATGTAGTTA TATACAGTAG GACCACATAC ATACATTTTT 1260

ACTTTCCCTG GTTCTATAGG CTTGAACACT TCTTTTTGAC GTGTAAGCGT ATTATATAAT 1320

GTAATCATCT TGAATCTCTC CATTCCTAGT CTTTTCAAGT TGTCGTTCTA AATGCTTAAT 1380

TTGTTCATAA ATTGGATCAG GTAGATGGCG ATGATCAAAT GTTTTTCCAA CTCGAACACC 1440

ATCTTGCTTA PiQhP^PCKIQ CTGGTATACC AACAACCGTT GAATAACTTG GAACTGATTG 1500

TAAAACAACT GAATTTGCAC CAATATTTAC ATTTGAATTT ATTTTAATAT TTCCTAAAAC 1560

TTTCGCACCG GCTGCTATTA AAACATTGTC TCCTATATCT GGGTGTCTTT TCCCTCTTTC 1620

TTTCCCTGTC CCACCAAGTG TCACGCCTTG ATAGATTGTC ACATTATCAC CAATTGTACA 1680

TGTTTCTCCT ATTACAACGC CCATACCATG ATCTATAAAT AGACGCTTTC CAATTTTAGC 1740

ACCTOGATGG ATTTCTATAC CTGTGAAAAA TCTTGAAATT TGAGATATCXS CGCGTGCTGC 1800

PAQKTKrrrr tttvggtvgt ataacttatg TGCAATCAAA TGACTCCAAA CTGCATGTAA 1860

ACCTGCATAC GTTGTAATGA CTTCTAATGT TGAACGTGCC GCTGGATCCT GCTCAAATAC 1920

CATTTTTATA TCGTCTCTCA TTCTTTTTAA CAAGATCATT TCCTCCTCAA TGATTGAACT 1980

ACGTAAATAC ATAATTGAAG TACCTGCGAA ATTAAATATC AAAAAAGCAC CACTAACATA 2040

CAAATTGTAT TGTTAGAGGC GCTTCCGCAC GGTTCCACTC TGAATTTAGC GAATAACATT 2100

AATAATATTG CGGGCGCTTC CAAATTATCA AGGAAACTAA GTCAACTTAA TGCTCATCAC 2160

TCTCATTATA TATTTAATTC ATTTTACGAA GGTGCATTCA TTAATTTCTA CGTTGTACTC 2220

ACAGCAACaS TACACTCTCT OCATCGTATA AATTTAATTA CTAATCCTTC GTTTTATATA 2280
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ATAAAATTCA AGTATATACT ACCTTGATCT 
TGTCTATTTC ATTACTTATA TTGTTTTAAA 2400

CGGTTTAGCA CTTTTTCTTT ACCAAGTACT TCAATTGTAT TTGGTAATTC AGGACCATGC 2460

ATTTGGCCTG TTACAGCAAC ACGAATAGGC ATAAATAATT GCTTGCCTTT TATTCCTGTT 2520

TCTTTTTGAA CTTCTTTAAT TGTCTTTTTA ATTTCAGCCG CTTCAAATGG TTCAAGTGCT 2580

TCTAATTTAC TGAATAAGTG CGTCATTAAC TCTGGTACTT GCTCTCCATT AATCACTTGT 2640

TGTTCTTCTT CACCAAGAGC TGGCATTTCT TTAAAGAACA TTTCTGATAA AGGTACAATT 2700

TCACCGGCAT AACTCATTTC TTTTT6ATAA AGCGCAATTA ATTTOOGTCC CCAAGATAAA 2760

TCCTCTTCTG ACGGCACCTC AGGAATCAAA TTTGCTTTAA TTAAATGAGG TAATGCTAAT 2820

TGGAATACTG TTTCAGTATC TTTTTGTTTC ATATATTGGT TATTAACCCA TGCTAATTTT 2880

TGCTTATCGA AAAATGCTGG TGATTTTGAC AAACGCTTTT CATCAAAGAT TTTGATAAAT 2940

TCTTCTTTAG AAAAGATTTC TTCTTCACCT TCA6GAGACC AACCTAATAA CGCAATAAAA 3000

TTAAATAACG CTTCAGGTAA ATAACCTAAG TCACGATATT GCTCAATAAA TTGTAAAATT 3060

TGCCCATCAC GTTTACTTAA CTTTTTACGT TCTTCATTAA CAATTAATGA CATATGACCA 3120

AAACGAGGTG GCTCCCAGCC AAATGCTTCA TAAATCATAA TTTGTTTAGG CGTGTTTGAA 3180

ATATGATCAT CACCACGAAT TACATCTGAA ATTTGCATGT AATGATCATC TATAGCTACT 3240

GCAAAATTGT ACGTTGGAAT GCCATCTTTT TTTACGATAA CCCAGTCACC AATACCATTT 3300

GAATCAAATG AAATATTTCC TTTTACCATA TCATCAAATG AATACGTTTG GTTTTGAGGT 3360

ACTCGGAAAC GAATTGATGG TTGGCGTCCT TCTGCTTCAA ATTGTTGACG TTGTTCTTCA 3420

GTCAAATGCG CATGTTGACC ACCATAGCGA GGCATTTCAC CACGAGCGAT TTGCGCTTCA 3480

CGTTCAGCTT CTAATTCTTC TTCTGTCATA TAGCATTTAT ATGCTTTATC TTCTGCTAGT 3540

AACTGATCTA TTAATGGTTG GTAGATATGT TGACGTTCAG ATTGACGATA TGGTCCGTAG 3600

CCATTGTCTT TATCTACAGA CTCATCCCAA TCTAATCCTA ACCATTTAAG ATTATCAAAT 3660

TGTGATGTTT CTCCATCTTC TAAATTACGT TTTTTATCAG TATCTTCAAT TCGAATCACA 3720

AAATCTCCGT TGTAATGTTT AGCATACAAG TAATTGAATA ATGCTGTTCT TGCATTACCA 3780

ATATGAAGAT ACCCAGTTGG ACTTGGTGCA TATCTTACTC TTATACGATC GCTCATTTTT 3840

TTCACTCCTA AATTAAATAT CAGATTTTCA AGTTAGTTCA TATAAATTGT TCATTTGCTA 3900

TCTTCGACCG TCATAACAAA TGTCTAACTC GTCTTATTGT TAAAACGAAA CAATGCTTTT 3960

TAACATGACC TTAAAATAAT TTCATTGTTT AATCATAACA TAATTCCCTG GGTAATATGC 4020

TTAAATTTTA AATAGAAAGC TGTTGTTTTT TCAACACTTT AAAAAAGCTA TCCCTAAGAA 4080
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TTAAACTTCA AATTAACTAT TCAAATACGT TAAAATTGAT TCTAATTTTG TATGTCTTGA 4200 

TTGCTATAAG AATAACTTTA TTAATATCTA AAATTTAACA CTTAATGAAC TTGTTTCAAT 4260 

GATATATTAG CACTATTTGT ATTTTTTGAT AACTAATATG TTTTGCATTT ATTTATAGTT 4320 

ATACTTCAAA TTACAAACTt CGCCATTTCA TATACCTTTT AATATCTATT TTGTTTTCGT 4380 

CAACTACAGT TTTTATAATG ATACTGTATC TTCGATTTTT TTAGCAAAAA CAATTCTTCC 4440 

TGAAGATGTT TGCAATAAGC TGACTACTTC TAAATTGACA TGACTGCCAA TAAGATTTTT 4500 

AGCATTATCA ACAACTACCA TCGTACCATC ATCTAGATAT CCTACTGCCT GACCAGGCtC 4560 

CTTACCCATT TTTGTCAGTA AAATATGCAG TTGATCACCT TGATGTACAT TAGGTTTGAT 4620

TGCTTCTGAT AAATCATTAA CATTTAATGC TTTGATACCA TGTACATGAC AAACTTTATT 4680 

TAGGTTGAAA TCTGTCGTTA TAATACTTGC ATGATATTGT TTTGCTAATT TTAATAACAT 4740 

CGTATCAATA TCACTATGTG TTTTAGTTGQ ATGTATAACC TTTGTAGGAT AGTCTAAATC 4800

ATACAATTCA TTTAAAATAT CTAAGCCTCT TTTACCCTTT TCaCGTTTAA CACTGTCATT 4860 

TGAATCTGCA ACAATTTGTA ATTCATTAAT AACACCTTGT GGAATTAAAA TATTGCCATC 4920

GATAAAACCG CAACGAATGA CTTCTAAAAT ACGACCATCA ATAATTGCGC TTGTGTCGAT 4980

AATTTTTGGC GTAgcaCTTT TaGTATGTTG TGACATGGAA CGCGCTATAT TCTCAOGTAA 5040 

AAACATTAAC ATTTCATCTC GTTTTTTAAG GCCAAATTGG AAACCGAAAT AACATAGTAA 5100 

TATCGTAATT ATGACAGGAA TGAAATGATT AAAAATAGAG TTGCCAATTG ATTCTAATAT 5160 

AAACGACACC ATAAGAGAAA TAAGTAATCC GATTATTAAA CCTATTGTTG CGAATAGTAT 5220 

TTCAACAGCA CTTCTACGCA TAATAAAATG TTCTAAACCT TTTATAGCGT TAGTAACTCG 5280

TCTAATAAAT ACACCAAAAA TTAAGAACAT AAAAATACTA CCGATAATGC CATCTACATA 5340 

GTGATTTTTT AAAAAGCTGG AGTTTTGTAA TCCAAGATCA TTTGCAATTT CAGGAATAAT 5400

AATTATTCCT AATGCGCTCC CAATAATTAA GTAAATAATA ATAACCATTA GTTTAACGAT 5460

ATTCACACAA TGTCCTCCTT TCTTGATGTT TTATGAATGA AGAGCAAATG ACAATACTTC 5520 

ATGTACAGTA GTTACACCTA TTACTTGTAT ACCTTCAGGA TATGTCCATC CGCCTATATT 5580 

ATTTTTAGGA ATAATTACAC GTTTGAAACC TAGTTTTGCA GCCTCTTGCA CGCGTTGTTC 5640

TATCCGAGAT ACACGACGTA CCTCACCCGT TAAACCAACT TCTCCAATAT AGCAATCTAA 5700 

TCCGTCGACA GCTTTATCTT TAAAGCTAGA TGCAGTTGCT ACAATTACAC TTAAATCAAC 5760 

TGCTGGCTCC GTTAACTTTA CACCGCCAGC TACTTT6ATA TAAGCATCTT GTTGTTGTAA 5820 

TAGATAATTT TCTTTCTTTT CCAAAACAGC CATCAACAAA CTTAATCGAT TATGATCAAT 5880 

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TATTAAAAGT GGTCTGGTTC CCTCCATGGT TGCAACAATT GTTGAACCTG GAACATTTGT 6000 

TGAACGTTCT TCTAAAAACA TTTCAGATGG ATTATTTACA CCTTTTAATC CACTTTGCTT 6060 

CATTTCGAAG ATTCcCATTT CATTCGTTGA ACCAAAACGG TTTTTAACAG CTCGCAAAAT 6120

TCGATATGCG TGGTGTTCAT CGCCTTCAAA ATAAAGCACA GTATCaACCA TGTGTTCTAG 6180 

CAATCTTGGG cCCAGCAATT TGACCTTCTT TCGTTACATG ACCCACTATA AAAGTTGCaA 6240 

TGTTCATTTG TTTAGCAATA TTCATTAAAC TTTGTGTACT TTCACGAACT TGTGAAACAG 6300 

AACCTGGCGC AGAGCTGATT TCAGGATGAT ATATTGTTTG AATCGAATCC ACTACTAATA 6360 

AATCAGGTTG TTCTTCTTTT ACTGTTTGAT AAATAACTTC AAGATCTGTT TCAGCTAATA 6420 

CTTGCAATTC ACTTGAATCT TCATCTAATC GCTCTGCACG TAATTTAGTC TGACTAAGCG 6480 

ATTCTTCTCC AGTAATATAT AGTACTTTTT TCTTTTGAGA TAACGATGCA CAAATTTGTA 6540 

AAAGTAACXrr TGACTTACCA ATACCTGGAT CCCCACCAAT AAGTACTAAC GATCCGCTCA 6600

CAATACCTCC ACCTAATACA CGGTTGAATT CTGCTGAATC TGTTAACACT CTCGGCGTTG 6660 

TTTCATGTTT AATACTATTT AATTTTTGTA CTTTACCTGC TAATTCCTTG GTTTTAACTC 6720 

CATGTTTAGG ATTGGCTGCT TTTTCAACAA TTTCCTCCAT TTGATTCCAA GCGCCACAAT 6780 

TAGGACATTT CCCCATCCAT TTAGGAGATT GATAACCACA AGCCATACAT TCAAAAATCA 6840 

CTTTTTTCTT GGCCArAATT GCAcCTCCAC TTTCTT 6876 
(2) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1193 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187:

CAACTCAAAC AGCAGAACAA CGTCGTGAGT TGATTAATCG TGTATTTACT GACATTAATC 60 

CCATACATTA AAAATATGAT GTACGTGTTA GCAGATAATA GACATATCTC ATTAATAGCT 120 

GACGTATTCA AGGCGTTCCA AAGCTTATAT AACGGACACT ACAATCAAGA TTTTGCAACA 180 

ATTGAGTCAA CATATGAATT GAGTCAAGAA GAGTTAGATA AGATTGTCAA ACTAGTAACT 240 

CAACAAACGA AGTTATCTAA AGTTATTGTA GATACAAAAA TTAATCCAGA TTTAATTGGT 300 

GGATTTAGAG TTAAAGTCGG CACAACTQTA TTAGATGGTA OTGTTAGAAA TGATCTTGTC 360 

CAATTACAAA GAAAATTTAG AAGAGTTAAT TAATTATAAA GAGGAGT6AC ATAGATGGCC 420 

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10 



ATGTCCGTAA CTGATGTAGG TACTGTATTA CAAATTGGTG ATGGTATTGC ATTAATTCAC 540

GGATTAAATG ACGTTATGGC TGGTGAGCTA GTAGAATTCC ATAACGGCGT ACTTGGTTTA 600

GCCCAAAACC TTGAAGAGTC AAACGTGGGT GTGGTTATTT TAGGACCATA CACAGGTATT 660

ACTGAAGGTG ACGAAGTTAA ACGTACTGGT CGTATCATGG AAGTACCAGT AGGTGAAGAA 720

CTAATCGGAA GAGTTGTTAA TCCATTAGGA CAACCTATTG ATGGAcAAGG ACCGATTAAC 780

ACAACTAAAA CACGTCCaGT AGAGAAAAAA GCTACTGGTG TAATGGATCg TAAATCAGTA 840

GATGAGCCAT TACAAACAGG TATCaAAGCA ATT6ATGCTT TAGTACCAAT TGGTAGAGGT 900

TAATCATCGG TGAC06TGAA ACAGGTAAAA CAACAATTGC AATTGACACA 960

ATTTTGAACC AAAAAGATCA AGGTACXSATT TGTATCTATG TTGCTATTGG TCAAAAAGAT 1020

TCAACAGTAA GAGCAAATGT TGAA7UVGTTA AGACAAGCAG GCGCTTTAGA CTACACTATT 1080

GTTGTAGCAG CATCAGCTTC TGAACCTTCT CCATTATTAT ATATTGCACC ATATTCAGGT 1140

GTAACAATGG GTGAAGAATT CATGTTTAAC GGTAAACATG TTTTAATCGT TTA 1193



(2) INFORMATION FOR SEQ ID NO: 188: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5549 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:

TCCTAAGAAG TCAAAATAAA CTAACTATnA AACATCTAGT ACGATTATTA AAGTGACAGA 60

TnATAAAATT GAATTATTnA GAGAAGGAGA TATAAAGTTT GAAGAAATAA AAGAAAGACT 120 

AGGTACAGGT ATTATITATG AATAAGTTAA TACTTGGGAT TTATTTATAC CGAATTTTTT 180 

CACGAGCATA CTTTTATTTA CCGTTTTTAT TAATTTACTT TTTGATTCAA GGTTATTCCA 240 

TAATACAATT AGAAATATTA ATGGCGTCTT ATGGCATTGC AGCATTTTTA TTCTCTCTAT 300 

ACAAAGAGAA GTGTTTTAAA ATTTGTAACT TAAAAGATTC TAATAAATTA GTTGTTAGTG 360 

AAATATTCAA AATCATCGGT TTATTGTTGT TATTATATCA AAATCAATAT TTAATTTTAG 420 

TAGTGGCACA AATATTATTA GGGTTAAGTT ACTCAATGAT GGCGGGTGTT GATACCGCAA 480 

TAATTAAAAG AAATATAACA AATGAGAAAT ACGTACAAAA TAAGTCAAAT AGCTATATGT 540 

TCCTATCATT ATTAATTTCA GGGATTATAG GTAGTTATCT TTATGGAATA AATATTAAAT 600 

GGCCTATAAT AATGACTGGT ATATTTTCAA TTCTAACAAT TATAATTATT CGATGCACAT 660

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10 



TACCAGAAGA GAAGTTTTGG ATATTGCATT ATTCTTTTTT AAGAGCGTTA ATATTAGGAT 780 

TTTTTATAGG ATTTATTCCA ATTAATATAT ATAATGATTT AAAACTGAAT AATTTACAAT 840 

TTATTTCAGT ATTAACTTGT TACACAGTTA TGGGTTTTGT ATCTTCACGT TATTTAACTA 900 

AATACTTGAA TTATAAGTTT GTGTCAGAAA TTTGTTTAGT AATATTTTTA ATAATATATA 960 

CATATCAAAG TTTCATAGCA GTTACTATTT CTATGATATT TTTAGGTATT TCTTCAGGGT 1020 

TAACTCGTCC ACAAACTATA AATAAACTTT CTAGCAGTAG TAACTTAAGA GTGATGCTTA 1080 

ATTATGCAGA AACGTTATAT TTTATTTTTA ATATCGCATT TTTACTTATG GGTGGTTACT 1140 

TATATACAAT AGGAACTATT CAATACTTAA TATTATTTAT TTCGTTATTA ATTTTTATAT 1200

ATTTAATAAT AATATTTyAT TTTACAAGGA GAGAGCAACA TGAAAATAAA AACTGAATTT 1260 

AAAGGGAACA ATATACCATA TGAATACGCA GCAGGTGCAG ATGTGAGTGA TTCTATTAAC 1320 

GGGAATCCAA TTAAGTCATT TCCATTTGAA GTAATTGAAT TACCGGAAGG gACTAAATAT 1380 

CTTGCTTGGT CTTTAATTGA CTATGATGCA ATTCCTGTAT GTGGCTTTGC TTGGATTCAT 1440 

TGGAGTGTAG CTAATGTAAG TGTTAGTGGC AATTCAATTT CTATAAAAGC AGATTTATCA 1500 

AGAACAAAGG GCGACTATGT ACAAGGTAAA AATAGCTTTA CTAGTGGGTT GTTGGCTGAA 1560 

GATTTTTCAG AAATAGAAAA TCACTATGTA GGACCTACAC CACCTGATCA AGATCATCAA 1620 

TATGAATTAA CAGTTTATGC GTTAGATCAT TCTTTAAATT TGAAGAATGG GTTCTACTTG 1680 

AATGAATTTT TAAAAGAAGT AAATCAACAT AAAATTGATC AAACAAGTAT TAACCTTATA 1740 

GGAAGAAAAA TTTAATACTA AATATCTCAT CAATATAAAA TTGTTCAATT AAAAGTACAA 1800 

AGAAACAAAG GTTTTAATTT ATATATTAGG TACGGCGTTC GCTATAATGC AAAGAAGTAA 1860 

TTAAATTTAA GAAATGTAAA CTTAGTTATT GTAATGTGAA TTTATTTGAA AAAATAGAAA 1920 

GTATTAACAA TTATAGCTTT TACATTAATT AAAATTTATT TTTAAAAACA AGTAAACAAT 1980 

TTACATACTT ATAATTTTTG AAAATTTTCA ATTTGTGTTA TATTGATTTT GTAAGATACT 2040

TTAACTCACA AAGGAGAGAG AGTATATGAA ATTAAAATCA TTTATAACTG TAACTTTGGC 2100 

ACTGGGCATG ATCGCAACGA CTGGCGCTAC TGTGGCAGGT AATGAGGTAT CTGCAGCAGA 2160 

AAAGGACAAA CTACCGGCAA CTCAAAAAGC TAAAGAAATG CAAAATGTTC CATATACAAT 2220 

TGCAGTAGAT GGCATTATGG CTTTCAATCA ATCTTACTTA AATTTACCAA AAGATAGCCA 2280 

ATTATCATAT TTAGATTTAG GAAATAAAGT TAAAGCTTTG TTATATGATG AACXSCGGTGT 2340 

AACACCTGAG AAGATTCGAA ATGCAAAATC TGCCGTTTAC ACGATTACTT GGAAAGATGG 2400 

TAGTAAAAAA GAAGTGGATC TTAAGAAAGA TAGCTACACA GCAAACTTGT TTGATTCAAA 2460 
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CAACATGAAG CATTTAATTT TACAGTGATG ATTATAAAAT AATTGCCTTG ATACAAAGAT 2580 

TACTCGTAAA TGACATCTTT GTATTAAGGC TTTTTCTAAA TTTAAAAGTG ATGGGTTAGA 2640 

GGTCATTGAG CTTTAAAATA TTCAAAATAC AAAACATTAA TGGCCAAAAA TAAAAGCCGC 2700 

CTTTATCTGG GCAGCTTCAA TAATAAGAAA GACATATTTC ATTTTATACT AAATAGTTAT 2760 

TGTGATGAAT CTTTCGGCGG TTTAATTACT GCAGCAAAAA TTGCTGTGAA AATCGTGAAC 2820

AATACTGCCA TGATAATTGG ATTCACTACA TTTAAGCTGT CTCCACCTAC TAGGCTATTA 2880 

AGTACAAAGT TAACCATTTO CATTAATAAT AATGCCCAAA AGAATGTTAC GAGGTGTTTC 2940 

ATGTCATTCT ACCTCCACTT TAATTATATA TATTTTATTT TAAGTGAAAG TTAGAAATTT 3000 

GTATAGTAAC ATCTCATATA TTTTGACCAT ATTATACAGT TTAAATAAAT GATTTTATCT 3060 

GAATGGCTAT TCTAAATTAA GCGCATTAAA ACCAATTTCA TACTGAAATT TGACGATAAT 3120 

AAAGCATTAA AATTTTATTA ACTAGTCAAT ATTCCTACCT CTGACTTGAG TTTAAAAAGT 3180

AATCTATGTT AAATTAATAC CTGGTATTAA AAATTTTATT AAGAAGGTGT TCAACTATGA 3240 

ACGTGGGTAT TAAAGGTTTT GGTGCATATG CGCCAGAAAA GATTATTGAC AATGCCTATT 3300

TTGAGCAATT TTTAGATACA TGTGATGAAT GGATTTCTAA GATGACTGGA ATTAAAGAAA 3360 

GACATTGGGC AGATGATGAT CAAGATACTT CAGATTTAGC ATATGAAGCA AGTTTAAAAG 3420 

CAATCGCTGA CGCTGGTATT CAGCCCGAAG ATATAGATAT GATAATTGTT GCCACAGCAa 3480 

CTGGaGATAT GCCATTTCCA ACTGTCGCAA ATATGTTGCA AGAACGTTTA GGGACGGGCA 3540 

AAGTTGCCTC TATGGATCAA CTTGCAGCAT GTTCTGGATT TATGTATTCA ATGATTACAG 3600 

CTAAACAATA TGTTCAATCT GGAGATTATC ATAACATTTT AGTTGTCGGT GCAGATAAAT 3660 

TATCTAAAAT AACAGATTTA ACTGACCGTT CTACTGCAGT TCTATTTGGA GATGGTGCAG 3720 

GTCSCGGTTAT CATCGGTGAA GTTTCAGATG GCAGAGGTAT TATAAGTTAT GAAATGGGTT 3780 

CTC^ATGGCAC AGGTGGTAAA CATTTATATT TT^TAAAGA TACTGGTAAA CTGAAAATGA 3840

ATGGTCGAGA AGTATTTAAA TTTGCTGTTA GAATTATGGG TGATGCATCA ACACGTGTAG 3900 

TTGAAAAAGC GAATTTAACA TCAGATGATA TAGATTTATT TATTCCTCAT CAAGCTAATA 3960 

TTAGAATTAT GGAATCAGCT AGAGAACGCT TAGGTATTTC AAAAGACAAA ATGAGTGTTT 4020 

CTGTAAATAA ATATGGAAAT ACTTCAGCTG CGTCAATACC TTTAAGTATC GATCAAGAAT 4080 

TAAAAAATGG TAAAATCAAA GATGATGATA CAATTGTTCT TGTCGGATTC GGTGGCGGCC 4140 

TAACTTGGGG CGCAATGACA ATAAAATGGG GAAAATAGGA GGATAACGAA TGAGTCAAAA 4200 

TAAAAGAGTA GTTATTACAG GTATQGGAGC CCTTTCTCCA ATCGGTAATG ATGTCAAAAC 4260 
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TGAACCTTAT AGCGTTCACT TAGCAGGAGA ACTTAAAAAC TTTAATATTG AAGATCATAT 4380 

CGACAAAAAA GAAGCGCGTC GTATGGATAG ATTTACTCAA TATGCAATTG TAGCAGCTAG 4440 

AGAGGCTGTT AAAGATGCGC AATTAGATAT CAATGAAAAT ACTGCAGATC GAATCGGTGT 4500 

ATGGATTGGT TCTGGTATCG GTGGTATGGA AACATTTGAA ATTGCACATA AACAATTAAT 4560 

GGATAAAGGC CCAAGACGTG TGAGTCCATT TTTCGTACCA ATGTTAATTC CTGATATGGC 4620 

AACTGGGCAA GTATCAATTG ACTTAGGT6C AAAAGGACCA AATGGTGCAA CAGTTACAGC 4680 

ATGTGCAACA GGTACAAATT CAATCGGAGA AGCATTTAAA ATTGTGCAAC GCGGTGATGC 4740 

AGATGCAATG ATTACTGGTG GTACAGAAGC ACCAATTACT CATATGGCAA TTGCTGGTTT 4800

CAGTGCAAGT CGAGCGCTTT CTACAAATGA TGACATTGAA ACAGCATGTC GTCCATTCCA 4860 

AGAAGGTAGA GATGGTTTTG TTATGGGTGA AGGTGCTGGT ATTTTAGTAA TTGAATCTTT 4920 

AGAATCAGCA CAAGCTCGAG GTGCCAATAT TTATGCTGAG ATAGTTGGCT ATGGTACTAC 4980

AGGTGATGCT TATCATATTA CAGCGCCAGC TCCAGAAGGT GAAGGTGGTT CTAGAGCAAT 5040 

GCAAGCAGCT ATGGATGATG CTGGTATTGA ACCTAAAGAT GTACAATACT TAAATGCCCA 5100 

TGGTACAAGT ACTCCTGTTG GTGACTTAAA TGAAGTTAAA GCTATTAAAA ATACATTTGG 5160 

TGAAGCAGCT AAACACTTAA AAGTTAGCTC AACAAAATCA ATGACTGGTC ACTTACTTGG 5220 

TGCAACAGGT GGAATTGAAG CAATCTTCTC AGCGCTTTCA ATTAAAGACT CTAAAGTCGC 5280 

ACCGACAATT CATGCGGTAA CACCAGATCC AGAATGTGAT TTGGATATTG TTCCAAATGA 5340 

AGCGCAAGAC CTTGATATTA CTTATGCAAT GAGTAATAGC TTAGGATTCG GTGGACATAA 5400 

CGCAGTATTA GTATTCAAGA AATTTGAAGC ATAACTATAA nAATCTTCAG TAACGTTGTT 5460 

TTAGTTACTG AAGATTTTTT CaGTTTCTTT ATACTAAGAT GAGCGACAcA CAATCGTCAT 5520 

AATAAAATAT GAATATTTAT TAATAATAA 5549 
(2) INFORMATION FOR SEQ ID NO: 189: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4832 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS : double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189: 
AGATTATAGT AAGATTGATA GTTTGGOGAC TGaAGCgCGa GaAAAATTAT CAGaAGTAAA 60 
mCCTTTAAAT ATTGCACAAG CTTCTAGAAT ATCAGGGGTA AATCCAGCAG ACATATCTAT 120 
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TGGTTAGCAG AACAATTAAA AGAACATAAT ATTCAATTAA CTGAGACTCA AAAACAACAG 240 

TTTCAAACAT ATTATCGTTT ACTTGTTGAA TGGAATGAAA AGATGAATTT GACAAGTATT 300

ACAGATGAAC ACGATGTATA TTTGAAACAT TTTTATGATT CCATTGCACC TAGTTTTTAT 360 

TTTGATTTTA ATCAGCCTAT AAGTATATGT GATGTAGGCG CTGGAGCTGG TTTTCCAAGT 420 

ATTCCGTTAA AAATAATGTT TCCGCAGTTA AAAGTGACGA TTGTTGATTC ATTAAATAAG 480 

CGTATTCAAT TTTTAAACCA TTTAGCGTCA GAATTACAAT TACAGGATGT CAGCTTTATA 540 

CACGATAGAG CAGAAACATT TGGTAAGGGT GTCTACAGGG AGTCTTATGA TGTTGTTACT 600 

GCAAOAGCAg TAGCTAGATT ATCCGTGTTA AGTGAATTGT GTTTACCGCT AGTTAAAAAA 660

GGTGGACAGT TXGTTGCATT AAAATCTTCA AAAGGTGAAG AAGAATTAGA AGAAGCAAAA 720 

TTTGCAATTA GTGTGTTAGG TGGTAATGTT ACAGAAACAC ATACCTTTGA ATTGCCAGAA 780 

GATGCTGGAG AGCGCCAGAT GTTCATTATT GATAAAAAAA GACAGACGCC GAAAAAGTAT 840 

CCAAGAAAAC CAGGGACGCC TAATAAGACT CCTTTACTTG AAAAATAATG CATAATCCTT 900 

TACAACTAAC ATAAAAGGAG CGAATGGATA ATGAAAAAAC CTTTTTCAAA ATTATTTGGT 960 

TTGAAAAACA AAGATGACAT CATTGGACAT ATTGAAGAAG ATCGCAATAG TAATGTTGAA 1020 

TCCATTCAAA TTGAACGTAT CGTTCCCAAC CGTTATCAAC CAAGACAGGT GTTTGAACCA 1080 

AATAAAATTA AAGAACTTGC TGAATCAATA CATGAACATG GTTTACTACA ACCTATTGTT 1140 

GTAAGACCGA TTGAAGAAGA TATGTTTGAA ATTATTGCTG GAGAGCGCCG ATTTAGAGCA 1200 

ATACAATCAC TAAATTTACC TCAAGCAGAC GTTATTATTC GTGATATGGA TGATGAAGAG 1260 

ACGGCTGTTG TTGCATTAAT TGAGAATATT CAAAGAGAAA ATTTGTCTGT TGTTGAAGAA 1320

GCGGAAGCCT ATAAGAAATT ATTGGAAATT GGTGATACAA CGCAAAGTGA ATTGGCAAAA 1380 

AGTTSAGGTA AAAGTCAAAG CTTTATTGCA AATAAGTTGC GTTTATTGAA GTTGGCGCOG 1440 

AAAGTACTAC TTOGCTTAAG AGAAGGTAAA ATTACTGAAC GTCATGCGAG AgcGGtATTA 1500 

TCATTGTCTG ATAGCGAACA AGAAGCGTTG ATTGAGCAAG TCATTGCACA AAAGCTAAAT 1560 

GTGAAcAGAC TGAAGATAGA GTACGCCAAA AAACGGGGCC CGAAAAAGTC AAAGCACAAA 1620 

ACCTTCGCTT TGCACAAGAT GTCACTCAAG CACGAGATGA GGTAGGCAAA AGTATCCAAG 1680 

CGATTCAACA AACAGGATTA CATGTTGAGC ATAAAGACAA AGATCATGAA GATTATTATG 1740 

AAATAAAAAT TCX5AATATAT AAACGTTaGT AGTAGGATGT CGTATACATG ATGACTAACA 1800 

CATAAAAGAC AAAGCTAAGA TCATAACAGC TTTGTCTTTT TTTTTTGTTT TACGTGAAAC 1860 

ATAAAAATTT ATATTTATAT GTTGATCAGG CTGGTACATA AATCAATGTT CTATGCTCTA 1920 
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TTCTAGTCAA CCTTGCTGGG GTGGGACXSAC GAAATAAATT TTGCGAAAAT ATCATTTCTG 2040 

TCCCACTCCC TAATTTGAGC TGGATATACT TTCATTTGAA CCCTTTATTG CTAGTTTATG 2100 

AAAGTATCAT GAAAGCTTTA TGAACATCGC TTGAGTTGCC TTTACAGTAG AAAATTTAAG 2160 

TTTTACACTT TGTGTGAATG ATACGTTTTG TATTGAATTA ATTATAGAAA GGTACGTTGA 2220 

AGATGTTTTC AATTGGAAGT GCAATTCTTC ATTTTGTCAT TGGTGGTATC GCTGTTGCAT 2280 

TAGCTTCAAT TATTGCTGAT AAGGTAGGTG GTAAGTTAGG AGGTATTATA GCTACTATGC 2340 

CGGCAGTCTT TCTTGCGGCT ATTATCGCAT TAGCTTTAGA TCATCGTGGT ACGCAATTAG 2400 

TGGAGATGTC GATGAATCTT AGTACTGGAG CAATTGTCGG TATTCTGTCT TGTATATTAA 2460 

CTGTATTTTT GACATCTCTC TACATTAAGC ATAAAGGTTA TCGGAAAGGC GCAATATTCA 2520 

CAGTTGTTTG TTGGTTTGTC ATTTCCCTCG CAATATTCAG TATTAGACAT TTATAGTTTG 2580 

GAAAATGCGT 6ATAATTAGT TGTATTCAGT TATTAAGTAA TAAATTATTG GAGGCAGAAC 2640

ATCATGAAAT TAACATTAAT GAAATTTTTT GTGGGGGGAT TTGCAGTATT ATTAAGTTAT 2700 

ATTGTATCTG TAACACTACC TTGGAAAGAA TTTGGCX3GTA TATTTGCaAC GTTTCCGGCA 2760 

GTATTTTTAG TGTCTATGTT TATTACAGGT ATGCAATATG GTGATAAAGT CGCTGTGCAT 2820

GTAAGTCGTG GCGCAGTGTT TGGTATGACA GGGGTATTAG TTTGTATTTT AGTTACATGG 2880 

ATGATGTTAC ATATGACGCA CATGTGGTTG ATTAGCATTG TTGTTGGTTT CCTAAGCTGG 2940 

TTCATCAGTG CAGTATGTAT TTTTGAAGCX; GTAGAATTTA TAGCACAAAA AAGATTAGAA 3000 

AAGCATAGTT GGAAAGCTGG AAAATCX5AAT AGTAAATAGT GTGAACGTAA TCTCTTAACT 3060 

AGGACTAACT TTGCAAGCAT TGAATAGCAT GGAAAAGTTG CATCATTAAT AAGTGAAATT 3120 

CAAGTTGGCA TTGAGAAAAT TACAAGCGCG TAATCATACa GGTCTGTCTT AAGGGAGTCT 3180 

TCGftACCCCG ATGTTGTCGT ATGTCAAAAC ATTTA6TCAA TCATAAAGGT GACTTGATTT 3240 

AACTTTATCT GATAGTCTGA TTGTAATGAT TGTACTAATT GACTGGAGGC GTATGTAATT 3300 

GAATCTQAGT AAACAAATTA AAAAGTATAG GGAACGAGAT GGTTATTCAC AAGAATATCT 3360 

TGCTGAAAAG TTATATGTAT CTAGGCAGAG TATTTCTAAT TGGGAAAATG ACAAAAGCTT 3420 

ACCAGACATA CATAACTTAT TAATGAyGTG TGAATTGTTC AATGTAACTT TAGATGATTT 3480

AGTAAAAGGG ACCATTCCAT TTGTACCTGA TATTAAAGCG CAACX3AAGTC TTAACTTATG 3540 

GACATATGTG ATGCTTATTT TCATGACATT AGCTGCAATT TTAATGGGAC CTTTAGTTGT 3600 

TTATTGGAAT TGGACTTGGG GTGTAACGGT GGCAATCATT TTGGGAATAG GTTTTTATGC 3660 

ATCTATGAAA ATAGAAGATT TAAAAAAAGT GCATAAAATG GACAACTACG ATCGAATTGT 3720 
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GACAAATGCG 


CTTTCTATTA TATCAGTAAT TGGTATACTC AGCCTCATAA TTTTCCTTAG 3840

TGTGTATTTG GCAAATAAGT TTTTATAAAT CATCGTGGTA TCGTCTCATA TTATTTATAT 3900

TATCCAAAAT AGCATAAAAA AATACCAACA AGATTTAGAA CCTTGTTGGT AATCAAAGCG 3960

aTTCATTTAT AATGAGTCGT TTTATGTTGT AAGATTAAAC AGTTTGTACG TTAACTGCTT 4020

GGTCTCCACG TTGACCTTCA GTGATTTCGA AAGTAACTTT TTGACCTTCT TCTAAAGTTT 4080

TGTAGCCATC GCTAGCGATA CCTGAiGAAAT GTACGAATAC GTCTCCGCCA TTTTCTTGTT 4140

CGATGAAACC AAAACCTTTT TCTGCrTTAA ACCATTTwAC TGTACCX3TTA TTCATATwGA 4200

AwACCTCCGT gTGCTTTTGC TGTAACAAAT TCATAACTAA AAAAGAGGAT 4260

ATTCTAAACA AATACACTAC AATTTAATTC ACGA6CTTTT ATTAC3GTAAG ACCAACTATA 4320

CGCTCATA1T GGCATAATGT ACAGTGTTTT TTGAAAATAA ATTAAAAAAG ATTTTTAAAA 4380

ACCTTAGAAA CGTTGATTTA AAGGGGTTTA TAAAAATwAw AAAATTGTAG TCTTTTATGG 4440

TGTTTGCTAG TTTTCAAAGT GACATATCGT TTAAACATGA TGATTTTATA AGCAATCCAT 4500

AAAAAACAAG CAGCGATAAA CGCTACTTGT TGATATTAAA ATCTGACTTG AAAGGTCATA 4560

GCAATGTTCT CTTGCCTTTT TCTTCACGAC GTTTTAAATA 4620

ATAAGAGCCA CCTAATAAAC CAGCTGGAAT GCCTATCATT GGTGTTGTGA ATGAGCTTAA 4680

TACAATAACA AGTATTGTTA AAGCAATGAC GTTATACCAA GTTACA6TCA AATTTTTCAA 4740

ATCCTCATAT GATTGTTTTA CTAATTCTCT AAATTTCATG ATTCAATCTC TCCTTTTTTA 4800

TAAATCTTTA GATTGTCAAA TTAAGCTGGA CA 4832



(2) INFORMATION FOR SEQ ID NO: 190: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5727 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190: 

CAAAGCTGTT CAAAAGGCTT ATAATTTAAA TTTAGATAAC ATAOGTACAA TGGAACCTAA 60 

GTTGAGATAT CAAGCGATCA ATAAAGGTAA TATTAATTTA ATAGATGCAT ATTCAACTGA 120 

CGCTGAATTA AAACAATATG ATATGGTTGT GTTAAAAGAT GATAAGCACG TATTTCCACC 180 

ATATCAAGGA GCACCATTAT TTAAAGAAAG CTTTTTAAAG AAACATCCAG AAATTAAGAA 240 

ACCGTTAAAC AAACTAGAAA ACAAAATATC TGATGAAGAT ATGCAAATGA TGAACTATAA 300 
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GTTAATCAAA TAACGACCAA CGCCACATAA GATGCGTAAC ACCAAATTAT ATCTTATGTG 420 

GCGTTGTTAT ATTTAAATCT ATAATTATGT TCAATTTAAA CATGCAATAA TGATTAAAAA 480 

ATATGACATG TTAAACACAA TGTAAGCTAT TATGATGTGA AAATAGTAGC ATTGCATTTT 540

AGAAACATAG AGCGATATAA TGAATATAAG TTTTTTGAAA TTTCAGTTAA TTCTAAGGAG 600 

GTTGTTTTTA TTATGAAAGA ACAACTTAAT CAACTATCAG CATATCAGCC TGGTTTATCT 660 

CCAAGGgCAT TGAAAGAAAA GTATGGCATT GAAGGAGATT TATATAAACT TGCATCAAAT 720 

GAAAATTTGT ATGGACCATC GCCTAAAGTT AAAGAAGCGA TATCAGCACA CTTAGATGAG 780 

TTATATTATT ATCCTGAAAC AGGATCACCG ACATTAAAAG CGGCGATTAG TAAACATTTA 840 

AATGTAGATC AATCACGCAT TTTATTTGGT GCGGGATTAG ATGAAGTTAT ATTAATGATT 900 

TCTAOAGCTG TATTAACGCC AGGGGATACT ATTGTTACAA GTGAAGCGAC ATTCGGTCAA 960 

TATTATCACA ATGCGATTGT TGAATCAGCT AATGT6ATAC AAGTACCTTT AAAAGATGGT 1020

GGCTTCGATT TAGAAGGTAT TTTAAAAGAA GTTAATGAAG ATACGTCATT GGTATGGTTA 1080 

TGTAATCCAA ATAATCCTAC AGGTACATAT TTTAATCATG AGAGCTTAGA TTCGTTTTTA 1140 

TCTCAAGTAC CTCCACATGT ACCAGTAATT ATAGATGAAG CTTATTTTGA ATTTGTGACA 1200

GCAGAGGACT ACCCGGATAC ACTTGCTTTG CAACAAAAAT ATGACAATGC TTTCTTATTA 1260 

CGTACATTTT CAAAGGCGTA TGGATTAGCG GGTTTACGTG TAGGATATGT GGTAGCAAGT 1320 

GAACATGCGA TTGAAAAATG GAACATCATT AGACCACCAT TTAATGTGAC ACGTATATCT 1380 

GAATACGCAG CAGTTGCAGC ACTTGAAGAT CAACAATATT TAAAAGAGGT AACACATAAA 1440 

AATAGTGTTG AACGCGAAAG ATTTTATCAA TTACCTCAAA GTGAGTATTT CTTGCCAAGT 1500 

CAAACGAATT TTATATTTGT AAAAACmAAG CX3GGTAAATG AACTTTATGA AGCACTTTTA 1560 

AATOTAGGGT GTATTACGCG ACCATTTCCA ACTGGTGTTA GAATTACAAT TGGTTTTAAA 1620 

GAACAAAATG ATAAAATGTT AGAAGTTTTA TCAAACTTTA AATACGAATA GTAAGTGGGG 1680 

AGTGGGACAG AAATGATATT TTCGCAAAAT TTATTTCGtC GTCCCACCCC AACTTGcATT 1740 

GTCTGTAGAA ATTGGGAATC CAATTTCtCT TTGTTGGGGC CCCJGCCGGCA AGGTTGACTA 1800 

GAATTGAAAA AAGCTTGTTA CAAGCGCATT TTCGTTCAGT CAACTACTGC CAATATAACT 1860

TTGTAGAGCA TTGAACATTG ATTTATGTCT CAAGCTCAAT GCAOTGTGAA TGATGAGGTG 1920 

AGAGTATTCA GTGTAAAAAG CAACAATAGA TGATATTGTT TTGTATCAAT TGCTTTTTTG 1980 

CTATACTGAA TCAATACTGA TATTTTCAGG AGAAGATTAA AATGACCCGT AAATCAATCG 2040

CGATTGATAT GGATGAAGTA TTGGCAGATA CATTAGGAGA AATCATTGAT GCTGTCAATT 2100 
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TTCCTGAACA TGATGGATTA ATTACAGAAG TATTGAGAGA ACCAGOCTTC TTCAGACATC 2220 

TTAAAGTGAT GCCGTATGCA CAAGAAGTTG TGAAAAAATT AACTGAACAT TATGATGTAT 2280 

ATATTGCTAC AGCAGCAATG GATGTACCAA CATCATTTAG TGATAAATAT GAATGGTTAC 2340 

TAGAGTTCTT TCCATTTTTA GATCCTCAGC ATTTTGTTTT TTGTGGTAGA AAAAACATCG 2400 

TTAAAGCTGA TTATTTAATA GATGACAATC CTAGACAGCT TGAAATTTTT ACTGGTACAC 2460 

CGATTATGTT TACAGCAGTG CATAATATTA ATGATGATCG ATTTGAACGC GTAAATAGCT 2520 

GGAAAGATGT AGAACAGTAT TTTTTAGATA ATATTGAGAA ATAAAATATA TCACTTGAAA 2580 

AATTTCATGT AGAAAAGATG ATGGATAGGC TATAAAGTAA TTGT6ACTGA GATGAACTTT 2640 

TATGTCTTAG ACACTACAAC ACTATATTGG CAGTAGTTGA CTGCGGGGCC CCAACATAGA 2700 

GAAATTGGAT TCCCAATTTC TACAGACAAT GCAAGTTGGG GTGGsCCCCA ACATAAAGAA 2760 

ATACTTTTTC TTTAGAAATT AGTATTTCTT ATGCATGAGT GTAACTCATG CATTCATATT 2820

TTTAAGTACA CATTAGCTGT 6ACTAAT6AT AAAGAATCGC TACATAATCA ATCATTAGTC 2880 

GTTCTTTATC ATTTCCGTCC CGCTCTCAAT AAATGTTAGT CTATCTTATT ATTATAAATC 2940 

GGATGAATGT GTTT^TCTAT GGCAGATTAC ACGTCATCCG ATTTTTTATA GAATTTGAAA 3000 

AAGACGCATA AACCACTATG ATTTAAAATA CAACATCAAT CATTTTAGTG gCATGCGCCA 3060

AAATTATATG TCTGTTTTTG AAACAGGGTA ATAGCTTAAA GCTAATAAAA ACGAATATAA 3120 

GGTGCGTTGA ATCTTATGAT TACACTCCAA ACCTAATATA ATATCGGGTT AAGATCATTC 3180 

CGGATGCTTA CAAATCATTG ACAGTAAGTA ACTGAATGGC ATTTGGTATA ACCTCAATAT 3240 

CAATAGGTGT TTCTAATGAA ATTTCGCCAT CAATATCAAC TTTCATTGCT GGATCTGTTG 3300 

TAAGTGAAAT CTTTTTACCA GGTATATGCT CAATACCTTG AGTAATTTCA TTCCaATTCA 3360 

TGCTATCAC6 CTTTTTAAAA ATATCATTTA AAATACTGAA ACTTTGTTCA TTAAAAATGA 3420 

AAGTGTTCAG TTCACCATCT TGAGQAGACA AATCAGTCaA TGGTATACQA CTACCACCAA 3480 

TGAATGGACC ATTTGCTGTT AGTATCATGG TCGTTTCGCC AGAATATGTC TTATCATCTA 3540 

TTGATAATTG ATAATTAAAT TGTGTTGGAT TTAGCAGTGT TTTGACAGTT GATCCAATAT 3600 

AACTCAATTT ACCAAATATA TCTTTTGAAC CATCTT6TAC GTTTTCAGCG TTTTGAACAA 3660

TGAGACCTAA GCCAACAAAG TTGAGTGCAT ATTGATTATT TATTTTAATT ACATCGTATG 3720 

TACCAACTTG TGCAGAAATC ATTTGTTCAC TAGCTTGTTT ATGATTAGGT GCTATATTTA 3780 

GCGTTTTTGT AAAATCATTA AAAGTACCGC CTGGTAAAAT GCCAATAGGG AGTTGAAGGT 3840 

CATGTGTCAT AACACCGTTT ATAAGTTCGT TAACCGTGCC ATCACCGCCA AGAATAAATA 3900 
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(2) INFORMATION FOR SEQ ID NO: 191: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14078 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191: 

TGGACTATTA ACGGCGaAGA AGATTTAACG AAATACTTAC AAACCAATGT TGATGGTATT 60 

ATCACAGATG ACCCAGCATT AGCTGATCAG ATTAAAGAAG AAAAGAAAGA CGAAACATAC 120 

TTCGATCGTT CTATAAGAAT TTTGTTTGAA TAATATAAAC AAAGACCTCT AAAGTTATCA 180 

AGATGATACC TTCAGAGGTC TTTTTAATGT TGCCATCTAT GGGATAGGCA ATCGTTTCAT 240 

TCGTTTATAT TCATATGACA AGTATTTGTA TGGCAATTTG GCGTCACAAA CACTTACATG 300

ATTTATTGGT GAATTATTAA TTGTTTTGTG AATGCAAAGG GTTAGAAATT GAATTGTAAA 360 

TACTTTCTAA TCTTTGTTTC GCTTTAGTCA TTTGATCCAA ATTTTTAGTG CGTATAGOGG 420 

ATTTTGCAAT ATAGTGCGCA CTAAAATATC GCGTTTTTGA AACGCATCTA AATTTAGGTA 480

CGATAATTTA TTTAAGTCAG TGTTTGCTAT TAATTCATGT AATTGATCTA CAAGCGCTTG 540 

ATGTTGATAC GTATGTGATG TAGTTTCAGA TTTGCTTGCT AATTTAATAC CAGTCGTATC 600 

AAGGAGCGCC GCTTTAATAC CAGCAACTAA ATATGTTTTG ATTTTCATTT GTGTTGTCAT 660 

GCTTTGTTAC TCCTTTGATG TACATTAATC AAAAAAATTA TACACTATTG TATATTGCAA 720 

AGCTAATTAA CTATAACAAA AAGATAGTTA ATGCTTTGTT TATTCTAGTT AATATATAGT 780 

TAATGTCTTT TAATATTTTG TTTCTTTAAT GTAGATTGGG CAATTACATT TTGGAGGAAT 840 

TAAAAAATTA TGAAAAAGCA AATAATTTCG CTAGGCGCAT TAGCAGTTGC ATCTAGCTTA 900 

TTTACATGGG ATAACAAAGC AGATGCGATA GTAACAAAGG ATTATAGTGG GAAATCACAA 960 

GTTAATGCTG GGAGTAAAAA TGGGACATTA ATAGATAGCA GATATTTAAA TTCAGCTCTA 1020 

TATTATTTGG AAGACTATAT AATTTATGCT ATAGGATTAA CTAATAAATA TGAATATGGA 1080 

GATAATATTT ATAAAGAAGC TAAAGATAGG TTGTTGGAAA AGGTATTAAG GGAAGATCAA 1140

TATCTTTTGG AGAGAAAGAA ATCTCAATAT GAAGATTATA AACAATGGTA TGCAAAITAT 1200 

AAAAAAGAAA ATCCTCGTAC AGATTTAAAA ATGGCTAATT TTCATAAATA TAATTTAGAA 1260 

GAACTTTCGA TGAAAGAATA CAATGAACTA CAGGATGCAT TAAAGAGAGC ACTGGATGAT 1320

TTTCACAGAG AAGTTAAAGA TATTAAGGAT AAGAATTCAG ACTTGAAAAC TTTTAATGCA 1380 
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GTTGTATCAT ATTATGGTGA TAAGGATTAT GGGGAGCACG CGAAAGAGTT ACGAGCAAAA 1500 

CTGGACTTAA TCCTTGGAGA TACAGACAAT CCACATAAAA TTACAAATGA ACGTATTAAA 1560 

AAAGAAATGA TTGATGACTT AAATTCAATT ATTGATGATT TCTTTATGGA AACTAAACAA 1620 

AATAGACCGA AATCTATAAC GAAATATAAT CCTACAACAC ATAACTATAA AACAAATAGT 1680 

GATAATAAAC CTAATTTTGA TAAATTAGTT GAAGAAACGA AAAAAGCAGT TAAAGAAGCA 1740 

GATGATTCTT GGAAAAAGAA AACTGTCAAA AAATACGGAG AAACTGAAAC AAAATCGCCA 1800 

GTAGTAAAAG AAGAGAAGAA AGTTGAAGAA CCTCAAGCAC CTAAAGTTGA TAACCAACAA 1860 

GAGGTTAAAA CTACGGCTGG TAAAGCTGAA GAAACAACAC AACCAGTTGC ACAACCATTA 1920 

GTTAAAATTC CACAGGGCAC AATTACAGGT GAAATTGTAA AAGGTCCGGA ATATCCAACG 1980 

ATGGAAAATA AAACX3GTACA AGGTGAAATC GTTCAAGGTC CCGATTTTCT AACAATGGAA 2040 

CAAAGCGGCC CATCATTAAG CAATAATTAT ACAAACCCAC CGTTAACGAA CCCTATTTTA 2100 

GAAGGTCTTG AAGGTAGCTC ATCTAAACTT GAAATAAAAC CACAAGGTAC TGAaTCAACG 2160 

TTAAAAGGTA CTCAAGGAGA ATCAAGTGAT ATTGAAGTTA AACCTCAAGC AACTGAAACA 2220 

ACAGAAGCTT CTCAATATGG TCCGAGACCG CAATTTAACA AAACACCTAA ATATGTTAAA 2280

TATAGAGATG CTGGTACAGG TATCCGTGAA TACAACGATG GAACATTTGG ATATGAAGCG 2340 

AGACCAAGAT TCAATAAGCC ATCAGAAACA AATGCATATA ACGTAACAAC ACATGCAAAT 2400 

GGTCAAGTAT CATACXX5AGC TCGTCCGACA TACAAGAAGC CAAGCGAAAC GAATGCATAC 2460

AATGTAACAA CACATGCAAA CGGCCAAGTA TCATACGGAG CTCX3TCCGAC ACAAAACAAG 2520 

CCAAGCAAAA CAAACGCATA TAACGTAACA ACACATGGAA ACGGCCAAGT ATCATATGGC 2580 

GCTCGCCCAA CACAAAACAA GCCAAGCAAA ACAAATGCAT ACAACGTAAC AACACATGCA 2640 

AACC3GTCAAG TGTCATACGG AGCTCGCCCG ACATACAAGA AGCCAAGTAA AACAAATGCA 2700 

TACAATGTAA CAACACATGC AGATGGTACT GCGACATATG GGCCTAGAGT AACAAAATAA 2760 

GTTTGTAACT CTATCCAAAG ACATACAGTC AATACAAAAC ATTACGTATC TTTACAACAG 2820 

TAATCATGCA TTCTATGATG CTTCTAACTG AATTAAAGCA TCGAACAATC GGAAGCATAT 2880 

TTCTAAATTA TTTATTCATT ATAGTCTTAA ACATAACATG ACCTAATATA TTACTAACCT 2940 

ATTAAAATAA ACCACGCACA TCTAAGTGAT ATACGACAAT CACAGCAATA ATAATTGCTT 3000 

TAGAAAGTCG TGCCGAACTG GAACTTACAA GTCTAGTTCG AACACACACT GATGTGAGTG 3060 

GTTTTCTTTA TTTTAAACAT GAACAATCAG ATAAGTTACT AGCATTAGCA AATATTATTA 3120

AATCAAAGGG CTTCGATTCA TAAAATTTAA AACAATGATT AAAATTAGAC GTGTAAATGT 3180 
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TATTTCACAC AGCTTCATTA ATAAAACGAA ATTGCTTCAA CGCGCTTCAA CTTCAACTGG 3300 

CTTCAACTTC AGCCTACTTC ATTCAATAAC AAAACGAATC CGCTTCATCC AAAATCAACC 3360 

ATTCTAACGC ACATATTCAA ATATAGCAGC TGCACCCATG CCGACACCAA TACACATCGT 3420 

AACCATGCCG TAACGGCTAT CGGGACGTCT ACCCATTTCA TTAAGTAAAC GCGCGGTTAA 3480 

CATTGCGCCT GTAGCACCTA ATGGATGACC TAAAGCAATA GCGCCACCAT TCACATTCGT 3540 

ACGTGATATA TCTAGACCTA CTTCTTTAAT AGATGCAATC GTTTGAGAAG CAAATGCTTC 3600 

GTTCAATTCG ATCAAATCAA TGTCTTCAAC AGATAGATTG CTGAGTGACA ATACTTCAGG 3660 

AATCGCATAT GCAGGCCCAA TACCCATAAT TTTCGGGTCA ACGCCTACTG CCTTAAAACC 3720 

AACGAATCGT GCAATAGGTG TCACGCCGAG TTCTTTCACT TTATCTCCAG ACATTAAAAC 3780 

TACAAATCCT GCACCATCAG AAAGTGGGGC AGATGTTCCT GCAGTCATAG TGCCGTCAGC 3840 

TTTAAATACT GTACGTAATT TGGCTAATGC CTCCATCGTG GTGTCAGGGC GTATAAATTC 3900 

ATCTTGGTCA AAGATATTTG TGTGTACTTT TGGTCCTGCG TTTGTATATT CAACTGAGTT 3960 

TACTTGTATT GGAATAATTT CATCTTTGAA CCGACCATCA CGTTGTGCGT CATAGGCACG 4020 

TTGATGACTT CTGACAGCAT AAGCATCTTG ATCTTCGCGT GATACGTCAA ATTGGGATGC 4080

TACATTTTCA GCAGTTAAAC CCATAGGATA TGACGCACCT ATATCATCAT ATTGTAAGGT 4140 

TGGATTGTTT GTGGGCTCGT TGCCACCCAT TGGTACGGCA CTCATCAATT CAACGCCACC 4200 

AGCTACAAGT ATATCTCCTT GACCAGCCAT AATTTGATTO GCTGCAATCG CGATGGTTTG 4260 

TAATCCTGAT GAGCAGTAGC GATTCACTGT TTGACCCGGT ACCGTGTCAG ATAATCCCGC 4320 

ACGCAATGCA ATCGTTCGTG CAATGTTTTG GCCTTGTAAT CCTTCTGGAA AAGCCGTACC 4380 

AACAATGACA TCTTCAATCA TATTCTTATT GAATTTTCCG TCAATACGTT TCAATACGCC 4440 

TTGTAATACT TTGGCTGCGA CATCATCAGG TCTTTCGTGG AATAATGCGC CTTGCTTTGC 4500 

TTTCGCTGCG GCTGAACGCC CATAAGCTAC AATGTATGCT TCTTGCATGG TTATCATCCT 4560 

CTCTTAATGA CTATCTTTTA ATTACGTAAT GGCTTACCAG TTTTTAACAT ATGTGCAATT 4620 

CTTTCATATG ATTTrTTAGA TTTTAGTAAG TCAATAAAGC CAATTTTCTC CAACGATTGA 4680 

ATGTAACGTT GATTGATAAA TGTATTTCTT GGTAAATCAC CACCCGCTAA AATTGTGGCG 4740 

ATATTTAAGG CAATATGATA ATCATGGTCG CTAATAAAAT GACCCCGTCT TTGC6CATCT 4800 

AATTGTCCTT GGATCAATGC TTTGAAGTCT TCACCTAAAG CGATATATTG ATGTCTAGGA 4860 

TTCGGAATAT AGTTTGTTTC TGCTTCATAT TTCGCACGTT TGAGCGCAAC TTCGACACX3T 4920

TGT6CTGTAT TGAAAATAAT CGTATCTGTA TCACGTAAAT AACCATAACG ACGTGCCTCA 4980 
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TGTTTGTCAT CAAACTTATG CGATGTGCXyr AATATGCGAT CAGCCATTTC TGCAAGGCCA 5100

CCGCCACTCG GTAATAAGCC AACACCTGCT TCAACAAGAC CGATATATGT TTCACTTGCA 5160

GCGACAACAA TAGGTGAGTA AAGTACAAGC TCACAGCCAC CGCCTAAGGC ACGACCTTGA 5220

ACAGCTGTGA CTACTGGTTT CAAACTATAC TTCAAACGAT TAAAGCTATA ATGTAATTTA 5280

TCAATTGATT GTGCAACGAC ATCATCTACA AGACCGTCTT CATGCGCCTT TTTCATTAAG 5340

AAAAGGTTAG CACCCACACT GAAATTGTTA CCATCTGCAT AAATAACCAT ACTTGTGTAA 5400

TGGTCATTTT CCAGTAAATC AATCGCATCA ACTAACGCAT CGTTGAATTC ATCGGTAATG 5460

ACATTATTTT TACTTTGTAA TTTCAGTAAC AGTTGATCAT CATGAGTTAC GGAAAGTTTG 5520

GCATCACCTT TATCCCAAAG TTCATCTTTT ACGAAGTGAG AAATAGGTGT TGCATATTCA 5580

ATGGTCTCAT CTTGTTTATA AAAGCCACCA TCTAAATCAC TAATCCATTG TGGTAAGTCT 5640

CCAAGTTCGT CTTCCATACG TGTTTTAACA CGTTCGTATC CCATTGCATC CCATAATTGG 5700

AATGGACCAA GTTTCCAGTT GAACCCCCAG ACAAGCGCAC GGTCTATGTC TCX3GAAATCA 5760

TCGGTAGCTT TAGGTACATT GATAGCAGAG TAATAGAAAT TATTACGTAA TGTCTCCCAT 5820

AAAAATAGTC CCGCTTCGTC TTGCGCATTG AATATGGTAT CAAGGTTATG CACTAAGTCT 5880

TTATTAAATT CATTTAAAAT TGGTAATTGT GGTTGCGATA CAGGTACATA ATCTTGTTTT 5940

TCAACATCGT AAACAAGTCG AGCTTTAGTT TCTTTATCCT TTTTGTAAAA TCCTTGTTTC 6000

GTTTTACGTC CGAGTGCGCC ATTGTCAAAC AACGTATTTA CAATTTTGAC ATCATGAAAA 6060

TAAGGTGTTT CTTCAGGTAC TTGTTGCATG CCTTTAATTA CAGACACTGC AATATCTAAA 6120

CCGACTAGGT CAGATAGCGC ATATGTACCT GTTTTAGGAC GACCAATCGC TTGCCCAGTT 6180

AAAGCATCCA CATCTACAAT GCTTATCTTG TGTTGCTCGG CGCGATACAT AATATCATTC 6240

ATTGlTTTGCG tgccgactct atttgcgaca aagccaggca CATCATTGAC GACAATGACA 6300

CCTTTACCTA ACACATTTTG CGCGAAATTT TTTACATCTA ATATAATAGA TTCCTTCGTG 6360

tgtgacgtag gtattaactc CACTAATTTC ATAATACGTG GTGGGTTAAA GAAATGTAGA 6420

ccaaagaatc gttcttgatc AATGCTTGAG CAATCGCATT AATTGGAATA 6480

CCTGATGTAT TTGTAGCGAA TAAAGCATCT TCTTTAGCAT GTTGTAGAAC TTGTTGCCAA 6540

ACAGCATGCT TAATTTCAAT ATCTTCTTTG ACTGCTTCXyV TATATAAATC AGCATCATCA 6600

TTTACCAAGT CATCATCAAA ATTACCATAT GTTAAATGAC TCGCTAGATT TAAGTCGAAT 6660

AGTAGCGGCC GTTTCTTATC TGTAATTTTA TCGTAAGATT TTTTCGCAAT GAGATTTGGA 6720

TCGTTTTTGT CCACTACAAT ATCTAATAGT TTTACTTTAA GTCCAGCATT CACAAAAAGT 6780
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GTGATTCCTC CAATTTAGTT GAGGATAAGA TAACCATTAA GATAATTGGA ATAACGTTGC 6900 

TATTTTATAA AATTAATTAA GTATCTTTGA CAGTCATCTT AGCCTCTTAT TTAAGGAAAA 6960 

AGCTTTATGC TTAAAATAAG TCTTTTTTAG TGAAATTAAT GCATCTCATA TAATTATTTG 7020

CTATTTATAC GAAAGCAGAA TCTCCAGTCA AAGCGCGTCC AATTACTAAG GCATTAATTT 7080 

CATGTGTACC TTCGTACGTG TAAATCGCTT CTGCATCAGA GAAGAAACGT GCAATATCAT 7140

AATCGTCAGC TAGTATGCCA TTACCACCTG TAATACCGCG GCCCATAGCT ACTGTCTCAC 7200 

GCAAACGTAA GGCATTCATC ATCTTCGCCO TTGAAGTTGC AACCTCGTCA TATTCACCAT 7260 

GTGCTTGCAT ATTAGCTAAT TGAGCACATG TTGCCATTGC TTGAGCTAAA TTACCTTGCA 7320 

TCATTGCTAG CTTTTCnTGT ATTAACTGAT ATTTACTAAT TGGTTTGCCG AATTGcTTAC 7380 

GCTCAGTGAC ATAATCTAAT GTGGCACGTA AAGCGCCAGC CATACCACCT GTAGCCATAT 7440 

AAGCAACGCC TGCTCTCGTT GAATAAAGAA TTTTGGCAAT ATCTTTAAAG CTTGTTATGT 7500 

TTTGTAAGCG ATCCGCTTCA TCTACTTTGA CATTAGTTAA TTTAATTAGG GCGTTAGGAA 7560 

CAATGCGAAG TGCGATTTTA TTATCAATGA CTTCAATATC GACGCCATCT TGTTCTGGTC 7620 

TGACTACAAA GCAATGGGGT TTGCCAGTTT CTTTATTTAC TGCGAATACT GGAATGACAT 7680

CAGATACATG TGCACCACCA ATCCATTTCT TTTCACCATT GATAACCCAA GTATCGCCTT 7740 

GGCGTTCAGC GACTGTTTCA AGACCTCCCG CAACGTCCGA ACCGTGTTCT GGTTCAGTTA 7800 

AAGCAAAGCA TGTACGCACT TCATGTGACT GTAATTTAGG TACATATTTC GCAATTTGTT 7860

CTTTGCTACC TCCGAAATAG AAAGTGTTAT GCCCTAAACC TTGGTGAACA CCGAGTAGGG 7920 

TAGCTAAGGA AATATCAAAT CGCGCGAGTA GGTAAGACAT GAAAAACTQA AATAGTTGAC 7980 

TAGGCATTTT GGCX3TTTGGA CGATCCTTGT AAAGTAATGG ATTGTTAAAA TAATTTAATT 8040 

CTCCCAGATC TTTAAAATAG TCCTCGGGTA CAGTAGCGTC TATCCAATGT TGATTAATAT 8100 

TTTCACGGTA CTTACTTTCT AGCAATGAAT CTACTTGTTG TAAAAATTCG ACTTCACCGT 8160 

CTGTTAAACC TTTAGCAATA CTAAGTACAT CTTCAGGAAA TAATGTTTTT AAGACCGTTT 8220

CTTTTTCAAA TGTCATATAA ATTCCTCCTA AAAATAATAT GAATACTAAT GTGAAATGCA 8280 

TTTAATTCAA AAACAACACG CTTTATTTOT AAACGCTTAC ACTAAATGTC AAAAATTTTT 8340 

ATCACCTTTA AAGTGTTTGC GAGACTTTGT CATTCATCAT TTGTCGAATC GCAAGTTTAT 8400 

CTGGTTTCTG CGTACTGTTT AACXX5CATAT GTGTCACTGG TACATACATT CTTGGGACTT 8460 

TATAACCTGC TAAACGACTt CGCATATGTT GATTTAAAAT TTCAGCGTAA TGAGGTTCAT 8520

CTTCGCGAAG TATAATGGCT GCAGCAATTG ATTCACCATA TTTTGGATGA TCATAGCCAA 8580 
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AGACATTTTC GCCACCAGTT ATGATTAATT CTTTTTTGCG GTCAATAATA AATATATCGC 8700 

CATCGTTGTC CATCTTCGCT AAGTCACCAG TTAATAAATA TCGACCATGA AATGCTTTGG 8760 

CAGTCTCTGC TGGTTTATTC CAATATCCTG GCGTGACATT TTTAGCCTTA ATTOCAAGTT 8820

CGCCAATCTC ACCAGTAGGT ACTTCCTCAC CGTTATCATC AAGGATACGT GCATCAACGA 8880 

ACATGACTGC TTTACCAATA CTCATTGGCT TACGTTTTGA ATTTTCCGGT GTATTAACAA 8940 

GTACAAGAGG TGCTTCAGTT AAACCATAGC CGTTAATAAT GTTTATGCCA TATTGTTTAA 9000 

AAGCTGCTTG GATACTTGGT AATGGTTGTG AACCACCTTG GATGATATAA TCCATAGCTC 9060 

TAAAATTTTC AGGATTAAAA TTACTAGCAC GTAGCX3TACT ATAATACATT GTCGGAATCA 9120 

TGATAATAAA TGTAGGGTGA TATTGTGCAA TCATGTCATT CAATTCTTCG CCGTTAAAGT 9180 

AACGTTGAAG AATAAGTGTG CCACCTGACA TTAATACTGG TAATACAGTA TCGTTAAACC 9240 

CTAAAACATG GAACATTGGT GTTGATACAA TCGTAATATA GTTTGAATTG AACTTATACG 9300 

TCAGCTCTAA GTTTGCACCG TTATGAACAA ATGATTCATA TGAGAACATC ACACCTTTAG 9360 

GTGATCCXX5T TGTACCACTT GTATAAATTA ATGCTGCAAG ATCTTGTGGT TCAACAGGTG 9420 

TTGCTTGAAA AGGTTGGTGA TAATCTGGAT TTACGATTTC ATCATATTGC GCCACATCAA 9480

TATCCATATG CAATAAGTTT TGGTCAATAT CGGTGAGTGA ACTTAAATGT TTTTCAGCAT 9540 

AGAAGAGCAG TTTTAATTGT GCATCTTCCA CAATGGCTGC AATTTCTTTT GGGTTAAGCC 9600 

GCCAATTCAA TGGTAAAAAA ACCGCACCTG TTTTAAAACA AGCAAACAAT AAATCTAATA 9660 

TTGCAATATC ATTTGGCGCA AAAATACCGA TAACATCGCC TTTTTTAACA CCTTGAGATG 9720 

TTAAATAATG TGCCATATTA TCAGCGCGTG CATTGAGTTG TTGGTATGTC CAAGATGTTT 9780 

GTTTTGCGTG ATCAATAACG GCAGGCTTGT CATCATCX3AA GTCTGAACGC GTTTTTATCC 9840 

AATGGAAATT CATTAGTATA CCCCCTTTAG CTTCACTTTC ATACTTTATG AATTGATTGT 9900 

TTAAGTTGTC CCCATTTTTC TTTGTAAATG CTGGTATCAA TTAATTTTAA ATGATCAGCA 9960 

ATAATTGGTT TAAAAGCCAT TTGATTCAAA ATATCTTTAT GCAAATCAAG ACCTGGTGCA 10020 

ATTTCAATTA GTTTCAAGCC TTGATTGGTG AGTTCGAATA CTGCACGATC AGTAACAAAA 10080 

TAGATTTCTT GCTCGAGTGA TTGTGAATAT TGTGCATTAA AGTCGATATG GCTCACATCT 10140 

GATACAAATT TCTGGTTTTG TCCTTCAGTT TCAATGTTTA ATCGTTGATT ATGGCATGAG 10200 

ACATGACTGC CAGCTACAAA AGTACCTGAA AAGATAATTT TATTTACAGA TTGCGTAATG 10260 

TCTATAAAGC CACCACATCC ATTTAGTCGG TCATTGAAGT AAGACACGTT GACATTGCCG 10320

TATTGATCAA CCTCAGCAAA GCTAAGATAG GCAACTGATA CACCATTGTT ATAAATAAAA 10380 
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CGACTCCCAA CGAATCCACC GAAAATGCCA ACATCTAAAA TCGGTTGCAC ATCATGTTCA 10500 

ACACATTCTT CATGCAATAA ATTAGAGAGT TCATTATTGA TGCCATAACC GATGCTAATT 10560 

GTATCGCCAT AAGTTAAAAA CTGAGCAGCA CGTCGGAGAA TCAATTTGCG ACTATTAAAA 10620

GGTAATGCGG GTTCAGGTAT TCCATCAATT CGTTCTTCTC CAGACAAGGC TGGTAAATAA 10680

TGACTTTGAA TTACTTGGCG GTGATTCTTT TCATCTTCTG TGACGTATAC ATAATCGACA 10740 

AGATTTCCTG GGATAACAAC TTCATTCGGT TTTAGTTGAT AGTCGTCAAC TAAAGCTTTA 10800 

ACTTGTACAA TAACTTTCCC ATGATTGGCT TTCGCGTTTA ATGCGACATG ATAACACTCG 10860 

CTCAAGTACG CTTCTTGAGT TAAATAAATG TTACCTTGTT GATCTGCGTA TGTTCCTCTC 10920 

AGTAGTGCCA CATCAACGCT AGGGAATGTG TAATGTAAGT ATGTTTCATC GTTGATGGTT 10980 

ACTAATGAAA CTAAATCATC CXrTTGTTCGT GTATTTACTT TACCGCCACC GTATCTAGGA 11040 

TCAACAGCTG TGTTTAATCC GATTTTAGTA ATAACTCCAG GTAATAATTG ATTACTCTGA 11100

CGATAATGAG TTGCAATGAT ACCTTGTGGT AAAAAATAAG CTTCAATGTC ATTATTTTTC 11160 

ATTGCTTGTG CCGTTTTGGA AGAAGCCGTT AAAATACTCA TAATGACACQ TTTAATCATG 11220 

CGACGTTCTA TAAAATCATC TAAATCCGGT GCGGCACCTA AACTATGAAT ATCATTCGCT 11280

AATATAAACG TTAAATCATT GGGCGTATGA TATGTGTCAT GTTGCGCTAA CACAGCACGT 11340 

AGAACTTCGG CGGGTAAGTT GGCTACAGCT AATGCTGGTA AACCAATCAC ATCACCATCT 11400 

TTAATGATAT GTTGTAAGTC GTGCCATGTG ATTTGTTTCA AGCAAGTCAC CTCCATCACA 11460

TTTGATAAAA TATAGCGTTT TTACACTTTG TGTAAACCCT TaCAAGAAAT ATAACATAAC 11520 

GACGTTTAAA ATCAATTAGA AATATCTTTT TATTCTGATA ATAGACACAG TATAGACACA 11580 

TTTTGATGGT CGATAACAAT TGTAATATCA AGGGTTTGTA ATGAATTGAA TATCATTAAA 11640 

ATACTTATAT AAAAATATTG TTCGGAATAT AAAAAGTTAA ATAGGTTTTG ATTTTTAAAT 11700 

ATGAAATACA AAGTGCCCAA TCGAACAAAG TATTTATATT AAAATATGGA AAATCCATCA 11760 

ATATTAAATT AAAATAGTTT TATTATGAAA AGTGAAAGTA GGTAAGTCTA TGGAAGGTCT 11820 

TAATCATCGA AGAAATACAG AAAAAGAAGA GACAACACAA ACGCAATCaG TTGCACCTAA 11880 

TACAGGTGAA GAGGGGATGT CATCAGCAA6 TACACAATCA ACTAAGACGT CCGACATACA 11940 

TAATGAATCT ATCGATAAAC AAATGGAAGC TAAAGCGCAT GAAACAGCGC AAAATACAGA 12000 

TTTAAAAAAC GAAGCAAGAA GTTTATTTGA TAATGCAACC AAATCAATCG GTAGACTAGC 12060 

GGGCAATGAT GAAAGCTTAA ATCTTAATTT AAAAGATATG CTTTCTGAAG TATTTAAGCC 12120

GCATACTAAA AACGAAGCAG ATGAAATATT TATAGCGGGT ACTGCTAAAA CTACGCCAGC 12180 
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TTTCACAGTA ACATTTATTG GATTATGGGT CATGGCAGCA ATTTTTAATA ACACTAACGC 12300 

GATTCCGGGT CTCATTTTTA TAGGGGCTTT AACAGTACCA TTATCGGGTT TGTTCTTCTT 12360 

TTATGAATCA AATGCXSTTTA AAAATATTAG CATTTTTGAA GTTATTATCA TGTTCTTTAT 12420

TGGCGGCGTA TTTTCATTAC TAAGTACGAT GGTATTATAT AGATTTGTCG TTTTTAGTGA 12480

TCAATTCGAA AGGTTTGGTT CTTTAACATT TTTCGATGCA TTTTTAGTAG GATTAGTTGA 12540 

AGAAACTGGA AAAGCACTCA TTATTGTTTA TTTCGTCAAT AAATTGAAAA C7AATAAGAT 12600 

TTTGAATGGA TTATTAATCG GTGCTGCTAT TGGTGCAGGG TTCGCAGTTT TTGAATCAGC 12660 

AGGTTATATT TTGAATTTCG CTTTAGGAGA AAATGTCCCA TTATTAGATA TTGTCTTCAC 12720 

ACGTGCGTGG ACTGCGATTG GTGGTCATTT AGTTTGGTCA kCGATTGTTG GTGCTGCAAT 12780 

AGTTATTGCX3 AAAGAACAGC ATGGCTTTGA ATTCAAAGAT ATTTTTGATA AACGCTTTTT 12840 

AATATTCTTT TTATCAGCCG TTGTTTTACA TGGCATTTGG GATACATCTT TAACTGTACT 12900 

TGGCAGTGAT ACGTTGAAAA TATTTATTTT AATCGTTATT GTGTGGATAC TTGTATTCaT 12960 

TTTAATGGGG GCAGGTTTAA AACAAGTGAA TTTACTGCAG AAAGAATTTA AAGAACAACA 13020 

GAAAAAAGTA GACGAATAAT AATTAAAGCT TATaTTOCTC ATATGTTTGT GACATAAGCT 13080

ATTTTTATAA TTTGTCTTTA AAAGAGTGGA ATAGGAATAC TTTTTGGAGT TAAAAAAGTG 13140 

TTtCACGTTA AACAAATAGT GACAATTAGA TTTATATAAA ATGAACATGA TTCACTGAAA 13200 

GTATGTAATA ATCATTTTAT TGAAATTCAT CAAACAGAAA TTAATACAAT CATATAAGCA 13260

AATTAAACCA CGCCATAATC ATATTGGATG ACTTCGGCGT GGTTTTTATA GTTGAAGCAG 13320 

GGCTGAGACA TAAATCAATG TCCCACACTC CCTTATCGTT CAATCGTTGT TCGATAATCG 13380 

ATTAAATAGA TACCTTCAGG TGTTACTTTA TAATTTTTAA CCTTAGAGTT AGCAGCGACT 13440

ATTTSSATCGT TGTAAGCAAT ATAACTGTTT GGTACATCTC GACTTGATAA TTTAATAATA 13500 

TCAtTAGAAA TATTGTGACG TTCCTTAACA TCTACAGTAT GATTCAATTG ATTAATTAAA 13560 

TCATCGACGT TGCTATTATT GTAGTCTCCT TTATTAATAG CACCATCTTT TTTATATGCT 13620 

TGATTAAAGA AATAACCTGT ATCTCCACGA GGAATTGTTC CGAAACTATA CATCGTTGCA 13680 

TCCCATGCAG AACGGTCTTT TAAGTAACCT TCTATGTCAT CAACACTTTT AATGTCGATT 13740 

TCAATATTTG CTTTTTTAGC ATCTGATTGT AATACTTGCG CAATTTTCGA TAGCTCTGGA 13800 

CGACCGTCAT ACGTAATTAA CTTAATTTTT AAAGGGTGTT CTTTTGTATA ACCATCTTTA 13860 

GCTAATAACA TTTTTGCTTG TTCGATATTT TGTTTGGTTA ACTTAGGTTC TTTAATATAT 13920

GGAATTTTAT CATTAAATGG ACTCGTTGCA GGTTTOGCAT AACCTTOATA AATATGATCT 13980 
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10 



TTATtAGTAT GATTATACAT AAGTaAGAAG TTCTAAAn 14078 
(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 486 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192: 

TGAAAACTAA AGTGTTTCTA ATGCGTGACT AAAATTAGTA ATAATTAAGT TCTCATGATA 60 

ATAGGTATTT TTGAAAAATG GAGGAGTCTA TAAATGGGTA AAAAAATGGG TCTAGGTTTA 120 

TCTATTGCAT TGGTTGTTAT TGGTATTGCC GTTGTATGTT TAATGATTTT TTCTAGTCAA 180 

AAAACGACTT ATTTTGGTTA TATGAATAGT AATACAAATG CAQAAAAAGT TGTCAGTQAA 240

AAAGATGGAT TAGTCAAACA TAATATCAAA GTAGAACCAT CTAATGATTT CAAGCCGAAA 300 

AAAGGA6ACT TTGTAAAATT AGTTTCTAAA GATGATGGGA AGACATTTTA TA/U^CAAGAG 360 

ATTGTTAAAC ATGATGACGT CCCACACGGT TTAATGATGA AAATTCACGA CATGCATATG 420

AATTAATAAA AAAGCATCTA TAACGTAATT TTGAAGAAGT AGAGTTATCT TCTTATGCGT 480 

TTTAGA 486
(2) INFORMATION FOR SEQ ID NO: 193: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1626 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193: 

GAGGTCTATA TACAATTATG GTTGTTCCAG TTAAACGAAC TGATGGCTTT ATTACTAAGT 60 

TTAATAGATT AATTGAAAGA CGATTATTAC GTCATTTCAG TAAAAAAGGT TATATCACAT 120 

GGGAGGAAAA TTGATTGTCT GACATTTTAA AATGTATCGG TTGTGGTGCQ CCACTTCAAT 180

CTGAAGATAA AAATAAACCT GGTTTTGTAC CAGAGCATAA TATGTTTCGT GATGACGTGA 240 

TTTGCAGACG TTGTTTCCGC TTGAAAAATT ATAACGAATT CAA6ATGTAG GATTAGAAAG 300 

TGAAGACTTT TTAAAATTAT TATCAGGACT TGCGGATAAA AAGGGTATTG TCGTCAATGT 360

CGTGGATGTA TTTGACTTTG AAGGATCATT TATTAATGCA GTTAAACGTA TTGTCGGAAA 420 
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